    October 17, 2023

    The Curious Case of AI Hallucination

    So, earlier today, I was having coffee with a friend who’s a technology program manager at a large financial services firm and fellow nerd – let’s call him Steve (since that’s his name).  Given our shared interests, the discussion eventually turned to generative AI and, in particular, Large Language Models (LLMs) like ChatGPT.  Steve is a technologist like me but hasn’t had a lot of exposure to LLMs beyond what he’s seen on Medium and in YouTube videos, and he was asking the predictable questions – “Is this stuff for real?  Can it do all the things they’re saying it can?  What’s up with the whole hallucination thing?”

    Overall, it was a great discussion, and we covered a broad range of topics.  In the end, I think we both came away from the conversation with interesting perspectives on the potential for generative AI as well as some of the potential shortcomings.

    I plan to share my perspective on our discussion in a series of posts over the next few weeks, beginning with the first topic we discussed — AI Hallucination. I hope you find value in these perspectives and, as always, comments are most welcome.

    So, what’s up with AI hallucination?

    In AI, hallucination refers to an LLM generating text that sounds plausible but is factually incorrect or unsupported by evidence. For example, if you prompted an LLM to “write about the first American woman in space,” it might pull plausible-sounding information from its vast training data but get the facts wrong, hallucinating fictional details and attributing them to the first female American astronaut. This tendency of large language models to confidently generate fake details and pass them off as truthful accounts, particularly when prompted on topics thinly covered in their training data, is extremely problematic, especially if the user is unaware of it and takes the output at face value.

    When I say “tendency”, I mean this is a common issue that arises frequently with large language models today. The propensity to hallucinate false details with high confidence is prevalent even in sophisticated models trained on huge datasets. For example, a 2021 study from Anthropic found that LLMs hallucinated over 40% of the time when asked simple factual questions from datasets they were not trained on. And OpenAI has warned that its GPT models “sometimes write plausible-sounding but incorrect or nonsensical text” and should not be relied upon for factual accuracy without oversight.

    This is especially dangerous in high-stakes fields like medicine or law. In fact, there is a recent story of a lawyer who used an LLM to prepare a court filing and inadvertently cited fake cases (fortunately, the court caught the fabricated citations before the case proceeded).

    As to why LLMs hallucinate, there are several potential reasons:

    • They are trained on limited data that does not cover all possible topics, so they try to fill in gaps.
    • Their goal is to generate coherent, fluent text, not necessarily accurate text.
    • They lack grounding in common sense or the real world.
    • Their statistical nature means they will occasionally sample incorrect or imaginary information, as the sketch below illustrates.
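
    To make that last point concrete, here is a minimal sketch of why sampling alone can produce wrong answers. The prompt, the candidate continuations, and their probabilities are all invented for illustration; a real model chooses among tens of thousands of tokens, but the mechanism is the same: any continuation with nonzero probability will eventually be sampled.

        import random

        # Toy next-step distribution for the prompt "The first American woman
        # in space was ...".  The probabilities are invented for illustration.
        next_step_probs = {
            "Sally Ride": 0.85,             # correct
            "Judith Resnik": 0.10,          # plausible but wrong (not the first)
            "a fictional astronaut": 0.05,  # outright fabrication
        }

        def sample_continuation(probs: dict[str, float]) -> str:
            # random.choices picks proportionally to the weights, so low-probability
            # (and possibly incorrect) continuations still appear some of the time.
            options, weights = zip(*probs.items())
            return random.choices(options, weights=weights, k=1)[0]

        samples = [sample_continuation(next_step_probs) for _ in range(20)]
        print(f"{samples.count('Sally Ride')} of 20 sampled answers were correct")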

    An important point is that the LLM does not intentionally construct false information (unless asked to do so); rather, it builds its responses based on available data (the data it was trained on). The models attempt to continue patterns and maintain internal coherence in their generated text, which can result in persuasive but false details when their knowledge is imperfect or they are asked to extrapolate beyond their training.  In some ways, this exacerbates the problem, as the model can respond with high confidence while, in fact, having no factual basis for the response.  Perhaps more worrisome, with further scaling up of models, this tendency may only become more pronounced as they get better at producing persuasive human-like text.

    Clearly, better techniques are needed to detect and reduce hallucination.

    Several approaches are being explored to reduce the occurrence of hallucination and/or to correct it prior to producing a response. Here are some of the techniques researchers are pursuing:

    • Reinforcement Learning from Human Feedback (RLHF): Having humans flag hallucinated text during the training process, then using Reinforcement Learning (RL) to adjust the model to reduce false information.
    • Incorporating Knowledge Bases: Connecting the LLM to an external knowledge base like Wikipedia can ground its output in facts.
    • Causal Modeling: Modeling cause-and-effect relationships helps the LLM better understand interactions in the real world.
    • Self-Consistency: Penalizing the model when its predictions contradict each other can minimize internal inconsistencies (a simple inference-time variant is sketched after this list).
    • Robust Question Answering: Training the model to carefully consider a question before answering reduces speculative responses.
    • Hallucination Detection Systems: Separate classifiers can be developed specifically to detect hallucinated text.
    • Retrieval Augmented Generation (RAG): Retrieving relevant text and data before generating, then grounding the response in that retrieved material (see the sketch after this list).
    • Human-in-the-Loop: Letting humans interactively guide the model during text generation can steer it away from hallucination.
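
    As a rough illustration of the self-consistency idea above, here is a sketch of a simple inference-time variant: sample the model several times and treat disagreement among the answers as a warning sign of hallucination. The sample_answer stub and its canned outputs are placeholders for real sampled LLM calls made with a nonzero temperature.

        from collections import Counter

        def sample_answer(question: str, seed: int) -> str:
            # Placeholder for a sampled LLM call; the canned outputs simulate a
            # model that usually answers correctly but occasionally drifts.
            canned = ["Sally Ride", "Sally Ride", "Sally Ride", "Judith Resnik"]
            return canned[seed % len(canned)]

        def consistent_answer(question: str, n_samples: int = 5, threshold: float = 0.6):
            answers = [sample_answer(question, seed=i) for i in range(n_samples)]
            best, count = Counter(answers).most_common(1)[0]
            agreement = count / n_samples
            if agreement < threshold:
                return None, agreement  # low agreement: flag as a possible hallucination
            return best, agreement

        answer, agreement = consistent_answer("Who was the first American woman in space?")
        print(answer, f"(agreement: {agreement:.0%})")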

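    Similarly, here is a minimal sketch of retrieval augmented generation. Everything in it is a simplified placeholder: the corpus is two hand-written sentences, retrieval is naive keyword overlap rather than a vector index, and generate stands in for a call to whatever LLM you are using. The point is the shape of the approach: fetch relevant material first, then instruct the model to answer only from that material.

        def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
            # Rank documents by naive keyword overlap with the query
            # (a stand-in for an embedding-based vector search).
            query_terms = set(query.lower().split())
            ranked = sorted(corpus, key=lambda doc: -len(query_terms & set(doc.lower().split())))
            return ranked[:k]

        def generate(prompt: str) -> str:
            # Placeholder for an LLM call; swap in the model of your choice.
            return f"[model response grounded in a prompt of {len(prompt)} characters]"

        def answer_with_rag(question: str, corpus: list[str]) -> str:
            context = "\n".join(retrieve(question, corpus))
            prompt = (
                "Answer the question using ONLY the context below. "
                "If the context does not contain the answer, say so.\n\n"
                f"Context:\n{context}\n\nQuestion: {question}"
            )
            return generate(prompt)

        corpus = [
            "Sally Ride became the first American woman in space in 1983 on shuttle mission STS-7.",
            "The Hubble Space Telescope was launched in 1990 aboard the shuttle Discovery.",
        ]
        print(answer_with_rag("Who was the first American woman in space?", corpus))
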
    Which solution(s) will perform best is determined in part by the particular use case (for example, RLHF might not be practical for very large datasets), and more than likely a combination of techniques will be required to achieve the desired level of confidence in responses.

    Even with these additional controls and safeguards in place, it will continue to be important to perform some level of quality control prior to using LLM output.

    As a thought experiment, let’s take a private equity firm — the firm wishes to use LLMs to streamline the summarization and analysis of corporate data for acquisition targets.  Indeed, LLMs can provide significant productivity lift in consuming and condensing large volumes of structured and unstructured data, and the firm can certainly use an appropriately fine-tuned LLM to facilitate the process of analyzing an organization’s fitness for acquisition.  Having said that, any specific conclusions produced by the LLM must be fact-checked and scrutinized closely to ensure their veracity prior to use in decision-making and, where necessary, adjusted.  Note that this is no different from the level of scrutiny that would be applied to human-generated analysis; the point is not to assume that because the analysis is ‘computer generated’ it is somehow more reliable – in fact, the opposite is true.

    All said, hallucination remains a significant obstacle to leveraging the full power and potential of large language models. But proper controls, along with continued research into techniques like the ones discussed here, provide a pathway for leveraging LLMs to generate accurate, trustworthy text as easily as they currently produce fluent, creative text.

    If you’re ready to take advantage of AI in a meaningful way but want to avoid the growing pains and pitfalls (including hallucinations), we should talk! Our 5-day AI assessment takes the guesswork out of maximizing the value of AI while minimizing the risks associated with LLMs. You can find out more about this offering here or connect with me on LinkedIn.

    (Note: Artwork for this and subsequent posts in this series is part of my collection, produced by MidJourney.  Linked here.)
