LLMs aren't Nirvana
#llm #AI
The Nirvana Fallacy dismisses something because it doesn't hold up against an unrealistic, idealised alternative. I see this kind of argument a lot when it comes to LLMs - a casual scroll through Mastodon is enough to uncover a handful of things that LLMs can't do well, or at all, which then serves as proof that LLMs are overhyped at best, and useless at worst.
The canonical example is hallucinations: Yes, LLMs hallucinate, which means that if we used the output of an LLM without supervision in, say, life-and-death scenarios, and the LLM hallucinated, the result would be disastrous. No argument from my side there.
True as this implication may be, dismissing LLMs as a whole because they don't fit this idealised view of how LLMs (allegedly) should work distracts from their immense usefulness elsewhere.
Two close family members have told me about their recent experiences with LLMs. Both are competent tech users, but nothing more. To oversimplify: they know enough to be a responsible admin of a local WhatsApp group, but they would probably describe someone stealing their crappy password as "I was being hacked!". Neither has any interest whatsoever in becoming more than that.
This means that they are pretty detached from my regular "tech bubble", so when they talk about how they interact with technology, I listen[1] with what I hope is child-like curiosity.
The first case is a family member who works in law. The way law works in Germany is roughly this: You have some law somewhere. You may have other laws elsewhere which supersede it. You have a prevailing opinion on how to apply the law. And you have a huge number of court rulings around said law. Working with laws means memorising a lot, and knowing where to look in order to find pointers to other places to look.
One of Germany's biggest publishers of law-related texts now provides an AI research assistant, trained on exactly the aforementioned material - because no matter how complicated a law is, you can rely 100% on everything being written down, which makes it an excellent use case for LLMs. So the AI does the heavy lifting, surfacing rulings and texts as a starting point for the actual specialist to work from. According to my family member, this is absolutely huge: this kind of research is typically mind-numbing and takes a lot of time, so the LLM offers real practical benefits here.
The second case is about a kind of health-related meeting notes. Let me explain.
My brother is a therapist, and as part of his work he talks to a lot of patients in a clinic. One specific part of his job is conducting the first interview with a potential patient, which serves, as far as I understand, as a first indicator of potential mental health issues. These interviews aren't fully free-form; he asks structured questions in order to uncover specific things - which makes sense: his findings aren't based on gut feelings, and other therapists have to be able to follow his train of thought and how he drew certain conclusions at a certain time. However, budget is tight, time is short, and due to the nature of his work it's common for him to hurry from one appointment to the next to put out some fires. When that happens, a lot of time may pass between conducting the interview and typing up a report based on his notes and the interview recording.
That's where the LLM comes in. It listens to the recording of the interview. It has been trained specifically on mental health research and texts, it knows the structured questionnaires, and it produces a preliminary report my brother can use as a starting point. He says that even if the LLM had a 30% error rate (which, according to him, it doesn't), fixing the errors would still be less work than typing up all the reports from scratch. What's more, the LLM is able to catch when my brother forgot to ask a certain question, or should have pursued a specific angle.
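To make the shape of this concrete, here's a minimal sketch of what such a pipeline could look like, assuming Python with the openai-whisper package for transcription and a generic hosted LLM for the drafting step. The questionnaire items, prompt, model choice, and file name are all illustrative placeholders I made up; the actual product presumably runs a domain-tuned model, not this.

```python
# Hypothetical sketch: interview recording -> transcript -> draft report.
import whisper
from openai import OpenAI

# Illustrative stand-ins for the structured questionnaire items.
QUESTIONNAIRE = [
    "Current mood and affect",
    "Sleep patterns",
    "Prior treatment history",
]

def transcribe(audio_path: str) -> str:
    """Turn the interview recording into plain text."""
    model = whisper.load_model("base")
    return model.transcribe(audio_path)["text"]

def draft_report(transcript: str) -> str:
    """Ask the model for a preliminary report plus a list of gaps."""
    items = "\n".join(f"- {q}" for q in QUESTIONNAIRE)
    prompt = (
        "Draft a preliminary intake report from this interview transcript.\n"
        f"Cover these structured items:\n{items}\n"
        "Also list any items that were not addressed in the interview.\n\n"
        f"Transcript:\n{transcript}"
    )
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(draft_report(transcribe("interview.wav")))
```

The point isn't the particular libraries - it's that the LLM only drafts, and a specialist reviews everything before it counts.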
While the first example - with the law texts - is mostly about saving time, the second one actually improves the quality of the specialists' output. There is no way, says my brother, that he could deliver that level of quality consistently, unless he only had one interview per day and could dedicate the remaining hours wholly to creating the report.
I am by no means an AI evangelist or anything, but these examples right there show just how much of a technological advancement LLMs are.
I started this habit back when I was starting out as a product manager. A pivotal experience was the following: At work, we were launching a completely new cell phone network, and we were hotly debating whether or not to include MMS. This was at a time when SMS was gradually being replaced by WhatsApp. Talking to people within my bubble brought me a wide variety of opinions. Talking to people outside of my bubble, like these two family members, invariably brought me the answer: "What's MMS?". We launched without MMS, and there were zero customer complaints. ↩︎