Methodology · Playground · Last updated 2026-04-20
How Hallucination Detector works
How the Hallucination Detector tool actually works — assumptions, algorithms, limitations.
Scope
This tool detects numeric-class hallucinations in an LLM's extraction of a source document. Numbers are the highest-value class to catch in financial settings — fabricated revenue, invented period dates, made-up growth rates, and invented percentages cause directly actionable bad decisions.
Prose-level fabrication (an LLM inventing a risk factor that isn't in the document, or fabricating a narrative attribution) is not detected. A future iteration will layer an embedding-based grounding pass on top of this numeric check.
Claim extraction regex layer
| Kind | Regex (illustrative) |
|---|---|
| Currency | ([$€£] \| USD \| EUR \| GBP) + digit(s) + optional suffix (B/M/K/bn/million/thousand) |
| Percent | -?\d+(\.\d+)?% |
| Date | Q[1-4] YYYY \| FY YYYY \| YYYY-MM-DD \| bare 4-digit year 20NN |
| Number | grouped (1,234) or >= 4-digit numbers |
Claims are deduplicated by (kind, normalized-value).
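The extraction and deduplication steps can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the concrete patterns, the `normalize` rule (strip whitespace and commas, lowercase), the priority order between kinds, and the names `PATTERNS` and `extract_claims` are all assumptions made for the example.

```python
import re

# Hypothetical patterns modeled on the table above; the real regexes may differ.
PATTERNS = [
    ("currency", re.compile(
        r"(?:[$€£]|\b(?:USD|EUR|GBP)\s?)\d+(?:,\d{3})*(?:\.\d+)?"
        r"(?:\s?(?:bn|billion|million|thousand|[BMK])\b)?")),
    ("percent", re.compile(r"-?\d+(?:\.\d+)?%")),
    ("date", re.compile(
        r"\bQ[1-4] \d{4}\b|\bFY ?\d{4}\b|\b\d{4}-\d{2}-\d{2}\b|\b20\d{2}\b")),
    ("number", re.compile(r"\b\d{1,3}(?:,\d{3})+\b|\b\d{4,}\b")),
]


def normalize(text):
    """Assumed normalization: drop whitespace and commas, lowercase."""
    return re.sub(r"[\s,]", "", text).lower()


def extract_claims(text):
    """Return unique (kind, normalized-value) pairs in first-seen order."""
    claims = []
    seen = set()
    taken = []  # character spans already claimed by a higher-priority kind
    for kind, pattern in PATTERNS:
        for match in pattern.finditer(text):
            start, end = match.span()
            # Skip matches that overlap an earlier claim, so "$2,847 million"
            # is not also counted as the bare number "2,847".
            if any(start < e and s < end for s, e in taken):
                continue
            taken.append((start, end))
            key = (kind, normalize(match.group()))
            if key not in seen:  # deduplicate by (kind, normalized-value)
                seen.add(key)
                claims.append(key)
    return claims


sample = ("Revenue was $2,847 million in Q3 2024, up 12.5% from 2023. "
          "Revenue of $2,847 million covers 1,200 stores.")
for claim in extract_claims(sample):
    print(claim)
```

Processing the kinds in a fixed priority order and masking consumed spans is one way to keep a single piece of text from yielding several overlapping claims; whether the tool resolves overlaps this way is not stated above.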
Grounding check
- Direct substring match. Normalized claim present verbatim in source → grounded.
- Numeric proximity. For currency and number claims, magnitude suffixes are expanded (B → ×10⁹, M → ×10⁶, K → ×10³) and the result is compared to every numeric token in the source. If any token is within ±1% of the target → grounded. Otherwise the closest value is recorded as `nearest` for user inspection.
- Date fallback. Any 4-digit year substring of the claim found in the source → grounded.
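The three checks above might be chained as in the sketch below. It assumes that source tokens have their suffixes expanded in the same way as claims, and that the ±1% tolerance is relative to the claim's value; neither detail is spelled out above. The names `check_grounding`, `numeric_values`, and `SUFFIX`, and the shape of the returned dictionary, are hypothetical.

```python
import re

# Assumed suffix table; B/M/K multipliers follow the text above.
SUFFIX = {"b": 1e9, "bn": 1e9, "billion": 1e9,
          "m": 1e6, "million": 1e6,
          "k": 1e3, "thousand": 1e3}

NUM_RE = re.compile(
    r"(\d+(?:,\d{3})*(?:\.\d+)?)(?:\s?(bn|billion|million|thousand|[bmk])\b)?",
    re.IGNORECASE)


def normalize(text):
    """Assumed normalization: drop whitespace and commas, lowercase."""
    return re.sub(r"[\s,]", "", text).lower()


def numeric_values(text):
    """Every numeric token in text, with magnitude suffixes expanded."""
    values = []
    for digits, suffix in NUM_RE.findall(text):
        scale = SUFFIX.get(suffix.lower(), 1.0)
        values.append(float(digits.replace(",", "")) * scale)
    return values


def check_grounding(kind, claim, source, tolerance=0.01):
    result = {"grounded": False, "method": None, "nearest": None}
    # 1. Direct substring match on normalized text.
    if normalize(claim) in normalize(source):
        return {"grounded": True, "method": "substring", "nearest": None}
    # 2. Numeric proximity for currency and number claims.
    if kind in ("currency", "number"):
        targets = numeric_values(claim)
        candidates = numeric_values(source)
        if targets and candidates:
            target = targets[0]
            nearest = min(candidates, key=lambda v: abs(v - target))
            if abs(nearest - target) <= tolerance * abs(target):
                return {"grounded": True, "method": "proximity", "nearest": nearest}
            result["nearest"] = nearest  # kept for user inspection
    # 3. Date fallback: any 4-digit year of the claim appears in the source.
    if kind == "date":
        if any(year in source for year in re.findall(r"\d{4}", claim)):
            return {"grounded": True, "method": "year", "nearest": None}
    return result


source = "Total revenue was 2.847 billion dollars in fiscal 2024."
print(check_grounding("currency", "$2,847 million", source))
print(check_grounding("currency", "$3.1B", source))
print(check_grounding("date", "Q3 2024", source))
```

Note that the date fallback is deliberately lenient: in this sketch "Q3 2024" is grounded by any mention of 2024, even if the source never refers to the third quarter.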
Grounding score
Score = grounded_claims / total_claims × 100%. Traffic-light tones:
- ≥ 90%: green (acceptable)
- 70% to < 90%: amber (inspect manually)
- < 70%: red (probable fabrication)
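As a small worked example of the score and its tones, the sketch below computes the percentage and maps it to a color. Two points are assumptions, since the text does not cover them: a document with zero claims returns `None` rather than a score, and fractional scores between 89% and 90% are treated as amber. The function names are hypothetical.

```python
def grounding_score(grounded_claims, total_claims):
    """Percentage of grounded claims, or None when there are no claims (assumed)."""
    if total_claims == 0:
        return None
    return 100.0 * grounded_claims / total_claims


def tone(score):
    """Map a score to the traffic-light tone described above."""
    if score >= 90:
        return "green"   # acceptable
    if score >= 70:
        return "amber"   # inspect manually
    return "red"         # probable fabrication


for grounded, total in [(9, 10), (17, 20), (3, 5)]:
    score = grounding_score(grounded, total)
    print(f"{grounded}/{total} -> {score:.1f}% ({tone(score)})")
```

With few claims the score moves in large steps: one ungrounded claim out of five already drops the result to 80% (amber).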
Limitations
- No prose grounding. A hallucinated causal claim without embedded numbers is invisible to this tool.
- Number unit inference is basic. "$2,847 million" vs "2.847 billion" is handled; exotic units (basis points, millis, thousands-of-thousands) may be missed.
- Within-1% tolerance is a heuristic. For true equality (e.g. exact share counts) this is too loose; for approximations (e.g. rounded margins) it's fine. Override on a case-by-case basis.
- Locale. Comma thousands-separator only. European `.`-thousands / `,`-decimal not supported yet.
- Context-aware grounding. A claim can be grounded on the number but wrong on its attribution (LLM says revenue when source reports cost of goods). This tool will mark it grounded; you still need to read carefully.