aifinhub

Last updated 2026-04-20

How Hallucination Detector works

How the Hallucination Detector tool actually works — assumptions, algorithms, limitations.

Scope

This tool detects numeric-class hallucinations in an LLM's extraction of a source document. Numbers are the highest-value class to catch in financial settings: fabricated revenue, invented period dates, made-up growth rates, and invented percentages lead directly to bad decisions.

Prose-level fabrication (an LLM inventing a risk factor that isn't in the document, or fabricating a narrative attribution) is not detected. A future iteration will layer an embedding-based grounding pass on top of this numeric check.

Claim extraction regex layer

Kind      Regex (illustrative)
Currency  [$€£]|USD|EUR|GBP + digit(s) + optional suffix (B/M/K/bn/million/thousand)
Percent   -?\d+(\.\d+)?%
Date      Q[1-4] YYYY | FY YYYY | YYYY-MM-DD | bare 4-digit year 20NN
Number    grouped (1,234) or >= 4-digit numbers

Claims are deduplicated by (kind, normalized-value).
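
The extraction and deduplication steps can be sketched in Python as follows. This is a minimal illustration, not the tool's actual code: the concrete patterns, the `normalize` rule (drop thousands commas, collapse whitespace, lowercase), the inclusion of a "billion" suffix, and the choice to skip a match that overlaps an earlier, higher-priority match are all assumptions made so the example is self-contained.

```python
import re

# Illustrative patterns in priority order (assumed, not the tool's real regexes).
PATTERNS = [
    ("currency", re.compile(
        r"(?:[$€£]|\b(?:USD|EUR|GBP)\s?)\d+(?:,\d{3})*(?:\.\d+)?"
        r"(?:\s?(?:bn|billion|million|thousand|[BMK])\b)?")),
    ("percent", re.compile(r"-?\d+(?:\.\d+)?%")),
    ("date", re.compile(
        r"\bQ[1-4] \d{4}\b|\bFY ?\d{4}\b|\b\d{4}-\d{2}-\d{2}\b|\b20\d{2}\b")),
    ("number", re.compile(r"\b\d{1,3}(?:,\d{3})+\b|\b\d{4,}\b")),
]


def normalize(raw: str) -> str:
    """Assumed normalization: drop commas, collapse whitespace, lowercase."""
    return re.sub(r"\s+", " ", raw.replace(",", "")).strip().lower()


def extract_claims(text: str) -> list:
    """Return unique (kind, normalized_value) pairs in order of discovery."""
    claims = []
    seen = set()
    taken = []  # character spans already claimed by a higher-priority kind
    for kind, pattern in PATTERNS:
        for match in pattern.finditer(text):
            start, end = match.span()
            # Skip e.g. "2,847" inside an already matched "$2,847 million".
            if any(start < e and s < end for s, e in taken):
                continue
            taken.append((start, end))
            key = (kind, normalize(match.group()))
            if key not in seen:  # deduplicate by (kind, normalized value)
                seen.add(key)
                claims.append(key)
    return claims


if __name__ == "__main__":
    sample = "Revenue was $2,847 million in Q3 2024, up 12.5%."
    for claim in extract_claims(sample):
        print(claim)
```

Running the patterns in priority order keeps a currency amount from also being counted as a bare number, which would otherwise inflate the claim total.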

Grounding check

  1. Direct substring match. Normalized claim present verbatim in source → grounded.
  2. Numeric proximity. For currency and number claims, magnitude suffixes are expanded (B → ×10⁹, M → ×10⁶, K → ×10³) and the resulting value is compared to every numeric token in the source. If any token is within ±1% of the target → grounded. Otherwise the closest value is recorded as the nearest match for user inspection.
  3. Date fallback. Any 4-digit year substring of the claim found in the source → grounded.
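
The three checks above can be sketched as a single function. This is a hedged illustration under stated assumptions, not the tool's implementation: `Result`, `ground`, `numeric_values`, and `normalize` are hypothetical names, the source is tokenized with one assumed regex, "billion" is accepted as a suffix, and the date fallback is applied only to claims of kind "date".

```python
import re
from dataclasses import dataclass
from typing import Optional

# Suffix multipliers; lookup is on the lowercased suffix.
MULTIPLIER = {"b": 1e9, "bn": 1e9, "billion": 1e9,
              "m": 1e6, "million": 1e6, "k": 1e3, "thousand": 1e3}

# Assumed numeric token: digits with optional grouping, decimals, and suffix.
NUM_TOKEN = re.compile(
    r"(\d+(?:,\d{3})*(?:\.\d+)?)(?:\s?(bn|billion|million|thousand|[BMK])\b)?")


@dataclass
class Result:
    grounded: bool
    via: Optional[str] = None        # "substring", "proximity", or "year"
    nearest: Optional[float] = None  # closest source value, for inspection


def normalize(text: str) -> str:
    """Assumed normalization: drop commas, collapse whitespace, lowercase."""
    return re.sub(r"\s+", " ", text.replace(",", "")).strip().lower()


def numeric_values(text: str) -> list:
    """Expand every numeric token in text to a plain float."""
    return [float(num.replace(",", "")) * MULTIPLIER.get(suffix.lower(), 1.0)
            for num, suffix in NUM_TOKEN.findall(text)]


def ground(kind: str, claim: str, source: str, tolerance: float = 0.01) -> Result:
    # 1. Direct substring match on normalized text.
    if normalize(claim) in normalize(source):
        return Result(True, "substring")
    nearest = None
    # 2. Numeric proximity, for currency and number claims only.
    if kind in ("currency", "number"):
        targets = numeric_values(claim)
        candidates = numeric_values(source)
        if targets and candidates:
            target = targets[0]
            nearest = min(candidates, key=lambda v: abs(v - target))
            if abs(nearest - target) <= tolerance * abs(target):
                return Result(True, "proximity", nearest)
    # 3. Date fallback: any 4-digit run of the claim appears in the source.
    if kind == "date" and any(y in source for y in re.findall(r"\d{4}", claim)):
        return Result(True, "year")
    return Result(False, None, nearest)


if __name__ == "__main__":
    src = "Total revenue was $2,847 million for fiscal 2024."
    print(ground("currency", "$2.85B", src))
    print(ground("currency", "$3.4B", src))
```

Note how permissive the date fallback is in this sketch: a claim of "Q3 2024" is grounded by any occurrence of "2024" in the source, regardless of quarter.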

Grounding score

Score = grounded_claims / total_claims × 100%. Traffic-light tones:

  • ≥ 90%: green (acceptable)
  • 70–89%: amber (inspect manually)
  • < 70%: red (probable fabrication)
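
A minimal sketch of the score and tone mapping, assuming hypothetical function names `grounding_score` and `tone`. Two details are assumptions rather than documented behavior: an extraction with zero claims raises an error because the ratio is undefined, and fractional scores between 89% and 90% are treated as amber.

```python
def grounding_score(grounded_claims: int, total_claims: int) -> float:
    """Percentage of claims that were grounded in the source."""
    if total_claims <= 0:
        # Assumption: the score is undefined when no claims were extracted.
        raise ValueError("no claims to score")
    return 100 * grounded_claims / total_claims


def tone(score: float) -> str:
    """Map a percentage score to the traffic-light tone."""
    if score >= 90:
        return "green"  # acceptable
    if score >= 70:
        return "amber"  # inspect manually
    return "red"        # probable fabrication


if __name__ == "__main__":
    for grounded, total in [(9, 10), (7, 10), (2, 3)]:
        score = grounding_score(grounded, total)
        print(f"{grounded}/{total}: {score:.1f}% {tone(score)}")
```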

Limitations

  1. No prose grounding. A hallucinated causal claim without embedded numbers is invisible to this tool.
  2. Number unit inference is basic. "$2,847 million" vs "2.847 billion" is handled; exotic units (basis points, millis, thousands-of-thousands) may be missed.
  3. Within-1% tolerance is a heuristic. For true equality (e.g. exact share counts) this is too loose; for approximations (e.g. rounded margins) it's fine. Override on a case-by-case basis.
  4. Locale. Only the comma thousands separator is supported. European formatting (period as thousands separator, comma as decimal separator) is not supported yet.
  5. No context-aware grounding. A claim can be grounded on the number but wrong on its attribution (the LLM says revenue when the source reports cost of goods). This tool will mark it grounded; you still need to read carefully.