Playground
Hallucination Detector for Financial Extractions
Paste a source document and the LLM extraction produced from it. Every numeric claim in the extraction is cross-checked against the source, and ungrounded claims are flagged. Runs in your browser. Free.
- Inputs: Source document + LLM extraction
- Runtime: Instant
- Privacy: Client-side · no upload
- API key: Not required
- Methodology: Open →
- Claims detected: 7
- Grounded: 2
- Not grounded: 5
- Grounding score: 29%
1 · Source document
2 · LLM extraction / output
3 · Markup
Legend: green = grounded; rose = not grounded (likely hallucination)
Ungrounded claims
- currency: $2,847 million
- percent: 18.1%
- currency: $412 million (nearest in source: 2025)
- currency: $780 million (nearest in source: 2025)
- percent: 35%
How grounding is checked
- Numbers are extracted from the output: currency amounts, plain numbers ≥ 1000, percentages, and dates.
- Each number is checked first for direct substring presence in the source, then for numeric proximity (within 1% of a number in the source).
- Dates get a looser, year-level fallback.
- Prose-level fabrication is not detected. This pass catches the numeric class only, which is by far the most costly class in financial extractions.
See methodology for the full algorithm, its limitations, and the planned embedding-based prose checker.
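The checks described above can be sketched in a few lines of Python. This is a minimal illustration and not the tool's actual implementation: the regular expressions, the `Claim` record, the `check_grounding` and `grounding_score` names, the handling of scale words such as "million", and the reading of the year-level fallback as "the date's year appears in the source" are all assumptions made for this example.

```python
import re
from dataclasses import dataclass

# Hypothetical extraction patterns, tried in order: ISO dates, currency,
# percents, then plain numbers >= 1000 (comma-grouped or 4+ digits).
CLAIM = re.compile(
    r"""
      (?P<date>\b\d{4}-\d{2}-\d{2}\b)
    | (?P<currency>\$\s?\d+(?:,\d{3})*(?:\.\d+)?(?:\s?(?:thousand|million|billion))?)
    | (?P<percent>(?<![\d.,])\d+(?:\.\d+)?\s?%)
    | (?P<number>(?<![\d.,])(?:\d{1,3}(?:,\d{3})+|\d{4,})(?:\.\d+)?\b)
    """,
    re.VERBOSE | re.IGNORECASE,
)
# Any numeric token plus an optional scale word, used to compute values.
NUMBER = re.compile(
    r"(\d+(?:,\d{3})*(?:\.\d+)?)(?:\s?(thousand|million|billion))?",
    re.IGNORECASE,
)
SCALE = {"thousand": 1e3, "million": 1e6, "billion": 1e9}


@dataclass
class Claim:
    kind: str       # "date", "currency", "percent", or "number"
    text: str       # the claim exactly as it appears in the output
    grounded: bool
    reason: str     # "substring", "proximity", "year", or "none"


def value_of(match):
    """Numeric value of a NUMBER match, with the scale word applied."""
    base = float(match.group(1).replace(",", ""))
    return base * SCALE.get((match.group(2) or "").lower(), 1.0)


def check_grounding(source, output, tolerance=0.01):
    """Flag each numeric claim in `output` as grounded in `source` or not."""
    haystack = source.lower()
    source_values = [value_of(m) for m in NUMBER.finditer(source)]
    results = []
    for m in CLAIM.finditer(output):
        kind, text = m.lastgroup, m.group()
        if text.lower() in haystack:
            reason = "substring"          # step 1: direct substring presence
        elif kind == "date":
            # Assumed year-level fallback: the year appears in the source.
            reason = "year" if text[:4] in haystack else None
        else:
            # Step 2: some source number lies within `tolerance` (1%).
            v = value_of(NUMBER.search(text))
            near = any(abs(v - s) <= tolerance * abs(s) for s in source_values)
            reason = "proximity" if near else None
        results.append(Claim(kind, text, reason is not None, reason or "none"))
    return results


def grounding_score(claims):
    """Percent of claims that are grounded, or None if there are no claims."""
    if not claims:
        return None
    return round(100 * sum(c.grounded for c in claims) / len(claims))


source = ("Revenue for fiscal 2025 was $1,250 million, up 12.4% year over "
          "year. The quarter closed on 2025-03-31.")
output = ("Revenue was $1,250 million (up 12.4%), with net income of "
          "$412 million and a 35% margin. Period end: 2025-06-30. "
          "Opex was $1,255 million.")

results = check_grounding(source, output)
for claim in results:
    print(claim)
print("Grounding score:", grounding_score(results))
```

In this made-up example, `$412 million` and `35%` come back ungrounded, `2025-06-30` passes only through the year fallback, and `$1,255 million` passes only through the 1% proximity check. If the score is the share of grounded claims, as assumed here, the panel's 2 grounded out of 7 detected rounds to the 29% it displays. Plain substring matching is deliberately naive: a claim of `5%` would match inside `35%`, so a real checker would likely need token boundaries.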
Complementary tools
Users of this tool often explore
Agent Skill Tester for Markets
Paste a SKILL.md definition, a sample input, and your Anthropic API key. See structured extraction, token cost, and latency, all in your browser. No signup; the key never leaves the page.
Price-Blind Research Auditor
Paste a research prompt or agent context bundle. The auditor flags price numbers, directional words, and outcome-leaking phrases that cause LLMs to retroactively rationalize positions. Builds a price-blind research boundary.
Prompt Injection Tester
Red-team a finance agent against 24 documented prompt-injection attacks: direct override, role confusion, indirect injection via retrieved content, jailbreak patterns, and tool-call hijack. Bring your own key; runs client-side against your live model.