Hallucination

Plain Explanation

Hallucination occurs when an AI system gives an answer that sounds fluent but is not supported by the facts, the source document, or the task constraints. The danger is that the answer often sounds confident, so readers may trust it before checking the evidence.

A useful analogy is a presenter filling gaps in their notes. The speech may be smooth, but when you compare it with the slide deck, some details were invented or stitched together from weak memory.

The key point is that the model is not intentionally lying; it is predicting plausible text under uncertainty. Production systems therefore need grounding, verification, abstention, and review paths, not just more polished wording.
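
As a rough illustration of grounding plus abstention, the sketch below returns a claim only when some source sentence appears to support it, and abstains otherwise. The names `best_support` and `answer_or_abstain`, the 0.8 threshold, and the handbook sentences are all assumptions made for this example. Token overlap is a crude stand-in for a real entailment or citation checker, so treat this as a sketch rather than a production verifier.

```python
import re
from typing import List, Optional, Tuple


def tokens(text: str) -> set:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def best_support(claim: str, sources: List[str]) -> Tuple[Optional[str], float]:
    """Return the source sentence covering the largest share of the claim's tokens."""
    claim_tokens = tokens(claim)
    if not claim_tokens:
        return None, 0.0
    best_sentence, best_score = None, 0.0
    for sentence in sources:
        score = len(claim_tokens & tokens(sentence)) / len(claim_tokens)
        if score > best_score:
            best_sentence, best_score = sentence, score
    return best_sentence, best_score


def answer_or_abstain(claim: str, sources: List[str], threshold: float = 0.8) -> str:
    """Attach the supporting sentence, or abstain when support is too weak."""
    sentence, score = best_support(claim, sources)
    if sentence is None or score < threshold:
        return "ABSTAIN: no source sentence supports this claim."
    return f"{claim} [source: {sentence}]"


# Hypothetical handbook content, invented for this example.
handbook = [
    "Refunds are available within 30 days of purchase.",
    "Shipping takes 5 business days.",
]

print(answer_or_abstain("Refunds are available within 30 days", handbook))
print(answer_or_abstain("Refunds are available within 90 days for premium members", handbook))
```

The first call is grounded in the handbook and keeps its source; the second adds details the handbook never states, so the overlap falls below the threshold and the function abstains.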

Examples & Analogies

A support chatbot invents a refund policy that is not in the company handbook. The tone is helpful, but the claim is unsupported.
A research assistant summarizes a paper and adds a result the paper never reported. This is especially risky when the sentence looks like a citation.
A coding assistant suggests a library option that does not exist. The suggestion should be checked against tests or the official docs before it is used.
In healthcare, law, finance, and security workflows, the safest answer may be to refuse, ask for more context, or route to a human reviewer.
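
For the coding-assistant example above, one cheap first check is to confirm that a suggested option exists at all. The sketch below uses Python's standard `importlib` and `inspect` modules; `option_exists` is a hypothetical helper written for this illustration. It only inspects named parameters, cannot introspect every callable, and does not replace tests or the official docs.

```python
import importlib
import inspect


def option_exists(module_name: str, func_name: str, option: str) -> bool:
    """Return True only if module.func has a named parameter called `option`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False  # the module itself may be invented
    func = getattr(module, func_name, None)
    if func is None or not callable(func):
        return False  # the function may be invented
    try:
        params = inspect.signature(func).parameters
    except (TypeError, ValueError):
        return False  # cannot introspect, so treat the option as unverified
    param = params.get(option)
    # Catch-all *args / **kwargs parameters do not prove the option is real.
    catch_all = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
    return param is not None and param.kind not in catch_all


# shutil.copy really accepts follow_symlinks; "overwrite" is a made-up option.
print(option_exists("shutil", "copy", "follow_symlinks"))
print(option_exists("shutil", "copy", "overwrite"))
```

Returning False for anything that cannot be verified is a deliberate choice here: an unverifiable suggestion is routed to a test or a docs lookup instead of being trusted.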

At a Glance

Type	What goes wrong	Typical mitigation
Missing-knowledge hallucination	The model lacks the needed fact or has stale knowledge	Retrieval, fresh sources, source comparison
Grounding failure	Sources exist, but the answer drifts away from them	Sentence-level citation checks, evidence mapping
Reasoning failure	An intermediate step is wrong and the conclusion follows it	Tests, rule-based verifiers, independent solution paths
Overconfident uncertainty	The model states an uncertain answer as fact	Confidence labels, abstention rules, human review
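
The "independent solution paths" mitigation in the table can be sketched as an agreement check: collect answers from several independent attempts and accept one only if enough of them agree. The function name `agree_or_abstain` and the 0.6 threshold are assumptions for this example, and the answers are hard-coded strings standing in for separate model runs. Agreement does not prove correctness; it only flags cases where the attempts disagree.

```python
from collections import Counter
from typing import List, Optional


def agree_or_abstain(answers: List[str], min_share: float = 0.6) -> Optional[str]:
    """Return the majority answer if its share reaches min_share, else None."""
    if not answers:
        return None
    # Normalize lightly so trivial formatting differences do not count as disagreement.
    normalized = [a.strip().lower() for a in answers]
    top_answer, count = Counter(normalized).most_common(1)[0]
    if count / len(normalized) >= min_share:
        return top_answer
    return None  # no clear agreement: abstain or route to review


# Hypothetical outputs from three independent attempts at the same question.
print(agree_or_abstain(["42", "42 ", "41"]))
print(agree_or_abstain(["paris", "lyon", "nice"]))
```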

Where and Why It Matters

News and research summaries lose credibility if the system adds facts that were not in the source material.
Enterprise chatbots can create real operational risk by misstating prices, policies, contracts, or compliance obligations.
Developer tools waste time when they invent functions, flags, or package behavior.
Agent systems raise the stakes because a hallucinated instruction can trigger a real tool action, such as sending an email or updating a database.
Hallucination control is therefore a system design problem: retrieval quality, validation, logging, permissions, and escalation matter as much as the base model.
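
The permissions and escalation points above can be sketched as a gate that sits between the model and its tools. Everything named here is hypothetical: the `ToolCall` type, the tool names, and the three outcomes are assumptions chosen for illustration, not a real agent framework's API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToolCall:
    name: str
    args: dict = field(default_factory=dict)


# Hypothetical tool inventory: reads are low risk, writes have real side effects.
READ_ONLY_TOOLS = {"search_docs", "read_ticket"}
SIDE_EFFECT_TOOLS = {"send_email", "update_database"}

audit_log: List[str] = []


def gate(call: ToolCall, human_approved: bool = False) -> str:
    """Decide whether a model-proposed tool call may run."""
    if call.name in READ_ONLY_TOOLS:
        decision = "allow"
    elif call.name in SIDE_EFFECT_TOOLS:
        # Side effects need explicit human approval before they run.
        decision = "allow" if human_approved else "escalate"
    else:
        # Unknown tools are denied: the model may have invented them.
        decision = "deny"
    audit_log.append(f"{call.name}: {decision}")
    return decision


print(gate(ToolCall("search_docs", {"query": "refund policy"})))
print(gate(ToolCall("send_email", {"to": "customer@example.com"})))
print(gate(ToolCall("delete_everything")))
```

Denying unknown tool names by default means a hallucinated tool fails closed, and the log gives reviewers a record of what the model attempted.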

Common Misconceptions

❌ Myth: Hallucination only happens when the model sounds unsure. → ✅ Reality: the riskiest cases often sound polished and confident.
❌ Myth: Adding RAG eliminates hallucination. → ✅ Reality: retrieval can fail, sources can be stale, and the model can still misread relevant evidence.
❌ Myth: Lower temperature is enough. → ✅ Reality: lower randomness does not fix missing knowledge or unsupported claims.
❌ Myth: Longer answers are more trustworthy. → ✅ Reality: longer answers contain more claims that must be checked.

How It Sounds in Conversation

"Which source sentence supports this claim? If none, the answer should abstain."
"Do not let the model invent policy details. Compare every policy claim with the handbook."
"This looks fluent, but the citation does not actually support the sentence."
"For code suggestions, require either a passing test or a link to official docs."
"Track hallucination by failure type, not just one aggregate score."

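Tracking hallucination by failure type, as the last line suggests, could look like the sketch below. It assumes each reviewed answer has already been labelled by a human or an automated checker with one of the four types from the At a Glance table, or with `None` when no problem was found. The label strings and the `failure_rates` helper are made up for this example.

```python
from collections import Counter
from typing import Dict, List, Optional

# Assumed label names, mirroring the four failure types in the table.
FAILURE_TYPES = (
    "missing_knowledge",
    "grounding_failure",
    "reasoning_failure",
    "overconfident_uncertainty",
)


def failure_rates(labels: List[Optional[str]]) -> Dict[str, float]:
    """Return the share of reviewed answers that fall under each failure type."""
    unknown = {l for l in labels if l is not None and l not in FAILURE_TYPES}
    if unknown:
        raise ValueError(f"unknown failure types: {sorted(unknown)}")
    if not labels:
        return {t: 0.0 for t in FAILURE_TYPES}
    counts = Counter(l for l in labels if l is not None)
    return {t: counts[t] / len(labels) for t in FAILURE_TYPES}


# Hypothetical review labels for eight answers; None means no hallucination found.
reviewed = [
    None, "grounding_failure", None, "reasoning_failure",
    "grounding_failure", None, None, None,
]

for failure_type, rate in failure_rates(reviewed).items():
    print(f"{failure_type}: {rate:.3f}")
```

A per-type breakdown points to the matching mitigation, which a single aggregate score would hide.
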