Hallucination Detection
Definition
Hallucination: an LLM output that is fluent, plausible, and wrong. Detection methods: (1) citation grounding — every factual claim is traceable to a source document; (2) cross-model voting — the same query routed to multiple models, disagreement flagged for review; (3) verifiable-claim extraction — extract structured claims, check each against a source of truth; (4) confidence calibration — ask the model for confidence and validate against actual accuracy.
Why it matters
LLMs confidently hallucinate price levels, fabricate SEC filing details, invent index members, and make up paper citations. In a trading agent, the cost of a hallucinated input is real money. Detection is necessary, not optional, for every model output that affects a decision.
How it works
Force structured output. For each factual claim, require a citation pointing to a chunk of the source document. Verify the citation actually contains the claim. For unstructured outputs, route the same query through 2-3 models and flag any factual disagreement for human review. Track the model's calibration — claims at 90% reported confidence should be wrong roughly 10% of the time; if the calibration drifts, flag the model.
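A minimal sketch of the first two checks in Python. The claim schema (`value`, `chunk_id`), the chunk store, the numeric regex, and the `vote` helper are all illustrative assumptions, not a standard interface; the point is that the cited chunk, not the model, is the source of truth, and that any set of model callables can be polled for agreement.

```python
import re

# --- 1. Citation grounding (structured outputs) --------------------------
# Hypothetical schema: the model must attach a source chunk id to every
# numeric claim it makes. Field names here are illustrative.
NUM = re.compile(r"\$?\d[\d,]*(?:\.\d+)?\s*[BMK]?", re.IGNORECASE)

def _normalize(s: str) -> str:
    """Canonicalize numeric strings so '$41.8B' matches '41.8 B'."""
    return s.replace("$", "").replace(",", "").replace(" ", "").upper()

def grounded(claim: dict, chunks: dict) -> bool:
    """Pass only if the cited chunk actually contains the claimed value."""
    chunk_text = chunks.get(claim["chunk_id"], "")
    cited_values = {_normalize(m) for m in NUM.findall(chunk_text)}
    return _normalize(claim["value"]) in cited_values

# --- 2. Cross-model voting (unstructured outputs) -------------------------
def vote(query: str, models: dict, extract) -> dict:
    """Send the same query to every model; flag any factual disagreement.

    `models` maps a name to any callable returning that model's raw answer;
    `extract` pulls the comparable fact out of a raw answer.
    """
    answers = {name: extract(call(query)) for name, call in models.items()}
    if len(set(answers.values())) > 1:
        return {"status": "FLAG_FOR_REVIEW", "answers": answers}
    return {"status": "OK", "value": answers.popitem()[1]}
```

Normalizing numeric strings before comparison avoids false FAILs on pure formatting differences such as `$41.8B` versus `41.8 B`.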
Example
An agent summarizes a 10-K and is asked for FY revenue.

| Field | Value |
| --- | --- |
| Model output | FY revenue was $42.3B |
| Actual filing | $41.8B |
| Cited chunk contains | $41.8B |
| Citation-grounding check | FAIL (model output ≠ cited value) |
Without citation-grounding, the $42.3B number flows downstream. With it, the mismatch is caught at extraction time, before the number reaches a trading rule.
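Reusing the `grounded` helper from the sketch above on this case (chunk text and id are invented for illustration; the figures are the ones from the table):

```python
claim = {"field": "fy_revenue", "value": "$42.3B", "chunk_id": "c117"}
chunks = {"c117": "Total revenue for the fiscal year was $41.8B."}

print(grounded(claim, chunks))  # False: $42.3B is not in the cited chunk,
                                # so the claim is blocked at extraction time
```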
Key Takeaways
Always require citations for factual claims; verify the citation programmatically.
Cross-model disagreement is a strong hallucination signal.
Calibration drift, a model becoming overconfident, is a leading indicator of model degradation; a minimal drift check is sketched below.
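A minimal sketch of that calibration check, assuming each verified claim is logged as a (reported confidence, was-correct) pair; the bucket width and the 10-point tolerance are illustrative choices, not recommendations.

```python
from collections import defaultdict

def calibration_report(events, tolerance=0.10):
    """Bucket claims by reported confidence, compare against observed
    accuracy, and flag buckets where the gap exceeds `tolerance`.

    events: iterable of (reported_confidence, was_correct) pairs gathered
    from model outputs that have already been verified.
    """
    buckets = defaultdict(list)
    for conf, correct in events:
        buckets[round(conf, 1)].append(correct)  # 0.0, 0.1, ... 1.0 buckets
    for conf in sorted(buckets):
        outcomes = buckets[conf]
        accuracy = sum(outcomes) / len(outcomes)
        drift = conf - accuracy  # positive = overconfident
        flag = "FLAG" if abs(drift) > tolerance else "ok"
        print(f"reported {conf:.1f} | actual {accuracy:.2f} | n={len(outcomes)} | {flag}")

# Example: claims reported at 90% confidence that were right only 70% of the time.
events = [(0.9, True)] * 7 + [(0.9, False)] * 3
calibration_report(events)
# reported 0.9 | actual 0.70 | n=10 | FLAG
```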
Try These Tools
Hallucination Detector
Paste a source document + an LLM's extraction. Every numeric claim in the output is checked against the source. Client-side. Catches silent fabrication.
Agent Skill Tester for Markets
Paste a SKILL.md definition + sample input + your Anthropic API key. See structured extraction, token cost, and latency — all in your browser. No signup.
Prompt Regression Tester
Run the same prompt against multiple models (Claude 4.5/4.6/4.7, GPT-5, Gemini 2.5) with your own keys. Diff outputs, score drift, catch regressions.
Related Content
Model Drift
Model drift: when an LLM's behavior changes between calls, versions, or weeks. The monitoring stack that catches it before production breaks.
Agent Skill Testing
Agent skill testing: the regression-test discipline for LLM-driven agents. What to test, how to score, and the difference between pass-rate and capability.