Subject 10

Hallucination reduction

Hallucinations are outputs that sound plausible but are unsupported, incorrect, or fabricated. Reducing them matters whenever an LLM is expected to be dependable rather than merely fluent.

Beginner

Models do not retrieve truth the way a database does. They predict likely next tokens. That makes them good at producing fluent answers, but fluency can hide the fact that a claim is unsupported. Hallucination reduction is the discipline of making the system prefer grounded, checkable, and sometimes incomplete answers over polished guesses.

What counts as a hallucination?

What hallucination is not

Issue | What it means | Why the distinction matters
Hallucination | The output contains unsupported or fabricated content. | You need grounding, checking, or abstention.
Outdated knowledge | The answer may have been true once but is no longer current. | You need fresher evidence, not just a stronger prompt.
Reasoning error | The facts are present, but the model combines them incorrectly. | You need step checks, decomposition, or task-specific validators.

Why hallucinations happen

First-line reduction tactics

Real-world example: an internal assistant is asked for the current expense reimbursement limit. If the model cannot find a trusted policy excerpt, the safe behavior is to abstain or escalate, not to guess a number that sounds normal.

def build_safe_prompt(question, evidence):
    return f"""
Answer the question using only the evidence below.
If the evidence is insufficient, reply exactly: I do not have enough evidence.

Question: {question}
Evidence:
{evidence}
"""

print(build_safe_prompt(
    "What is the reimbursement cap?",
    "Policy excerpt: travel meals are reimbursed up to the approved daily allowance."
))

A model that says "I do not know" at the right time is usually more trustworthy than a model that answers every question smoothly.

Intermediate

Effective hallucination reduction usually comes from a layered design. You prevent unsupported claims when possible, detect them when prevention fails, and recover safely when uncertainty stays high.

A practical defense stack

Layer | Goal | Typical technique
Prevention | Reduce the chance of unsupported content appearing. | Grounded prompts, narrower tasks, explicit abstention rules.
Detection | Find claims that exceed the available support. | Sentence-level review, claim extraction, consistency checks.
Recovery | Return a safer response when support is weak. | Abstain, ask a clarifying question, escalate to human review.
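The three layers can be sketched as a single pipeline. This is a minimal illustration under stated assumptions: `generate` and `extract_claims` are hypothetical callables supplied by the host application, and the detection step reuses the naive substring check rather than a real verifier.

```python
def answer_with_defense_stack(question, evidence, generate, extract_claims):
    """Illustrative prevention -> detection -> recovery pipeline.

    `generate` and `extract_claims` are assumed callables provided by the
    application; they are placeholders, not part of any real library.
    """
    # Prevention: refuse up front when there is no evidence at all.
    if not evidence.strip():
        return "I do not have enough evidence."

    draft = generate(question, evidence)

    # Detection: flag claims the evidence does not mention (crude substring check).
    normalized = evidence.lower()
    unsupported = [c for c in extract_claims(draft) if c.lower() not in normalized]

    # Recovery: abstain instead of shipping unsupported content.
    if unsupported:
        return "I do not have enough evidence."
    return draft

draft_answer = "Meals are reimbursed up to the approved daily allowance"
result = answer_with_defense_stack(
    "What is the meal reimbursement rule?",
    "Policy excerpt: travel meals are reimbursed up to the approved daily allowance.",
    generate=lambda q, e: draft_answer,
    extract_claims=lambda text: [text],
)
print(result)
```

Each layer can fail independently, which is the point of stacking them: a weak detector still catches some of what prevention misses, and recovery bounds the damage when both miss.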

Claim-level thinking

Do not judge a long answer only at the paragraph level. A response can be mostly good but contain one invented number, one fake citation, or one unjustified causal statement. In practice, important claims should be isolated and checked individually.

def unsupported_claims(claims, evidence_text):
    """Return the claims that do not appear verbatim in the evidence."""
    unsupported = []
    normalized_evidence = evidence_text.lower()

    for claim in claims:
        # Case-insensitive exact substring match. Deliberately naive:
        # a production system would use semantic or entailment checks.
        if claim.lower() not in normalized_evidence:
            unsupported.append(claim)

    return unsupported

claims = [
    "Employees may carry over five vacation days",
    "Unused sick leave is paid out annually"
]

evidence = "Employees may carry over five vacation days into the next calendar year."

print(unsupported_claims(claims, evidence))
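Exact substring matching is brittle: a supported claim phrased even slightly differently gets flagged. A slightly more tolerant sketch, still a bag-of-words assumption rather than a production check, scores word overlap instead:

```python
def claim_support_score(claim, evidence_text):
    """Fraction of the claim's words that appear somewhere in the evidence.

    A crude proxy for support; real systems typically use embedding
    similarity or an entailment model rather than word overlap.
    """
    claim_words = set(claim.lower().split())
    evidence_words = set(evidence_text.lower().split())
    if not claim_words:
        return 0.0
    return len(claim_words & evidence_words) / len(claim_words)

evidence = "Employees may carry over five vacation days into the next calendar year."

print(claim_support_score("Employees may carry over five vacation days", evidence))  # 1.0
print(claim_support_score("Unused sick leave is paid out annually", evidence))       # 0.0
```

A score between the two extremes is the interesting case: it usually means the claim is partially supported and deserves a closer look rather than an automatic pass or fail.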

Common failure patterns and fixes

User request
  -> classify risk and ambiguity
  -> answer only within allowed scope
  -> attach support for each important claim
  -> verify or flag unsupported content
  -> answer, abstain, or escalate

Keep this topic separate from RAG mechanics, search quality, and vector database design. Those systems may provide evidence, but hallucination reduction is about how the model behaves when evidence is present, weak, conflicting, or absent.

Advanced

At advanced maturity, hallucination reduction becomes a selective-generation problem: the system should answer when support is strong, refuse when support is weak, and surface uncertainty in ways the product and user can act on. The goal is not merely fewer false statements. The goal is better decision behavior under uncertainty.

High-value advanced patterns

Decision gating

def should_answer(evidence_score, verifier_passed, high_risk=False):
    # High-risk questions require an explicit verifier pass, no exceptions.
    if high_risk and not verifier_passed:
        return False
    # Below the evidence threshold, abstain regardless of the verifier.
    if evidence_score < 0.75:
        return False
    return verifier_passed

decision = should_answer(
    evidence_score=0.68,
    verifier_passed=False,
    high_risk=True,
)

print(decision)

What strong systems optimize for

Safe answer policy

1. Is the question within scope?
2. Is there enough trusted support?
3. Does each critical claim have backing?
4. Did a verifier or rule check pass?
5. If any answer is no: abstain, clarify, or escalate
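The checklist can be encoded as a single gate. The field names, the threshold, and the mapping of each failed check to abstain, clarify, or escalate are illustrative assumptions; a real policy would be set per product and risk level.

```python
def safe_answer_decision(in_scope, evidence_score, claims_backed, verifier_passed,
                         threshold=0.75):
    """Apply the five-step policy; any failed check routes away from answering."""
    if not in_scope:
        return "escalate"   # 1. out of scope: hand off
    if evidence_score < threshold:
        return "abstain"    # 2. not enough trusted support
    if not claims_backed:
        return "abstain"    # 3. a critical claim lacks backing
    if not verifier_passed:
        return "clarify"    # 4. verification failed: ask or re-check
    return "answer"         # 5. all checks passed

print(safe_answer_decision(True, 0.9, True, True))    # answer
print(safe_answer_decision(True, 0.6, True, True))    # abstain
```

Ordering matters: scope is checked before evidence so that an out-of-scope question is escalated even when a retrieval system happens to return confident-looking text.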

One recurring mistake is treating the model's own confidence as if it were a calibrated probability of truth. It is not. Self-reported confidence can be useful as a weak feature, but external support and post-generation checks are much stronger signals.
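The gap between self-reported confidence and actual accuracy can be measured directly. The sketch below uses synthetic numbers purely to illustrate the measurement; the values are not from any real model.

```python
def calibration_gap(confidences, correct):
    """Average self-reported confidence minus observed accuracy.

    A large positive gap means the model is overconfident. Inputs here
    are synthetic and only demonstrate the computation.
    """
    mean_conf = sum(confidences) / len(confidences)
    accuracy = sum(correct) / len(correct)
    return mean_conf - accuracy

# Synthetic example: ~90% reported confidence, but only 50% of answers correct.
confs = [0.9, 0.95, 0.85, 0.9]
right = [1, 0, 1, 0]
print(round(calibration_gap(confs, right), 2))  # 0.4
```

Tracking this gap over time is a cheap way to decide how much weight, if any, self-reported confidence deserves as a feature in a decision gate.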

Another mistake is optimizing only for helpfulness. If the product punishes abstention too aggressively, the system will learn to answer uncertain questions anyway. Good hallucination reduction accepts a trade-off: sometimes the correct output is a refusal, a follow-up question, or a handoff.
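This trade-off can be made concrete with a scoring scheme in which a wrong answer costs more than an abstention. The weights below are illustrative assumptions; the point is the asymmetry, not the specific numbers.

```python
def response_score(outcome):
    """Illustrative reward scheme: wrong answers cost more than abstentions."""
    return {"correct": 1.0, "abstain": 0.0, "wrong": -2.0}[outcome]

# Same five questions, two policies: answer everything vs abstain when unsure.
always_answer = ["correct", "correct", "wrong", "wrong", "correct"]
abstain_when_unsure = ["correct", "correct", "abstain", "abstain", "correct"]

print(sum(response_score(o) for o in always_answer))        # -1.0
print(sum(response_score(o) for o in abstain_when_unsure))  # 3.0
```

If abstention were scored as harshly as a wrong answer, both policies would tie and the system would have no incentive to refuse; the asymmetric penalty is what makes abstention a learnable behavior.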

A lower hallucination rate does not mean the system is safe in every context. High-stakes tasks still need domain rules, approval paths, and audited failure analysis.

To-do list

Learn

  • Understand the difference between fabricated content, outdated content, and reasoning mistakes.
  • Learn why next-token prediction can reward plausibility even when support is missing.
  • Study abstention, clarification, and escalation as core safety behaviors.
  • Learn why claim-level support is stronger than general "confidence" wording.

Practice

  • Write prompts that require the model to abstain when the answer is unsupported.
  • Take ten generated answers and mark which individual claims are supported versus unsupported.
  • Test ambiguous questions and compare direct answering against clarification-first behavior.
  • Inspect citations manually and note where a citation looks relevant but does not justify the exact claim.

Build

  • Build a small guardrail that classifies answers as supported, unsupported, or needs review.
  • Add a verification step that checks critical claims before the final answer is shown.
  • Create an error analysis sheet that groups failures into fabricated facts, fake citations, and unsupported synthesis.
  • Build a response template that always includes answer, support, uncertainty, and escalation path.