Hallucination

When an LLM generates plausible-sounding but factually incorrect or fabricated information that has no basis in its training data or provided context.

Hallucination is the most dangerous failure mode of LLMs in production. The model doesn't "know" it's making something up — it's generating the most probable next token given its context, and sometimes the most probable sequence is factually wrong. This is especially problematic because hallucinated content often sounds confident and well-written.

Common hallucination triggers include questions about specific facts (dates, numbers, names), topics with limited training data, requests that push beyond the model's knowledge, and prompts that implicitly encourage the model to guess rather than admit uncertainty. In growth applications, hallucinations can erode user trust in seconds — imagine a support bot confidently giving wrong pricing information.

Mitigation strategies include RAG (grounding responses in real data), explicit instructions to say "I don't know," output validation against known facts, temperature reduction for factual tasks, and citation requirements that force the model to reference its sources. The most robust approach is defense in depth: multiple layers of validation between the LLM's output and what the user sees.
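The validation layer can be sketched in a few lines. This is a minimal illustration, not a production guardrail: the plan names, prices, and function names are all hypothetical, and the "LLM output" is just a string passed in. The idea is simply that factual claims (here, prices) are checked against ground truth before the user ever sees them, with an "I don't know"-style fallback when validation fails.

```python
import re

# Hypothetical ground-truth table a support bot could validate against.
# Plan names and prices are illustrative, not from any real product.
KNOWN_PRICES = {"starter": 9.00, "pro": 29.00, "enterprise": 99.00}

def validate_pricing_claims(answer: str) -> bool:
    """Return True only if every dollar amount the model states near a
    known plan name matches the ground-truth price table."""
    for plan, price in KNOWN_PRICES.items():
        # Find "$<amount>" mentioned within ~40 chars after the plan name.
        pattern = rf"{plan}\D{{0,40}}\$(\d+(?:\.\d{{2}})?)"
        for match in re.finditer(pattern, answer, flags=re.IGNORECASE):
            if float(match.group(1)) != price:
                return False  # fabricated price: reject the whole answer
    return True

def answer_with_guardrail(model_answer: str) -> str:
    # Defense in depth: unvalidated factual claims never reach the user.
    if validate_pricing_claims(model_answer):
        return model_answer
    return "I don't know the exact price; let me connect you with a human."
```

In a real system this check would sit alongside the other layers mentioned above (retrieval grounding, citation checks, low temperature for factual tasks), so a hallucination has to slip past several independent filters before a user sees it.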
