What does the Price-Blind Research Auditor methodology page document?

It states the formulas, assumptions, data sources, limitations, and reproducibility steps behind the Price-Blind Research Auditor, in the Finance category. It covers the rule families, severity weighting, leakage score, and limitations, together with source citations, assumption deltas, and as-of dates.

When was the Price-Blind Research Auditor methodology last reviewed?

This methodology was last reviewed on 2026-04-20. The matching tool is at https://aifinhub.io/price-blind-auditor/.

Are the Price-Blind Research Auditor numbers reproducible?

Yes. This page embeds a worked example whose output is the verbatim result of running the shipped price-blind-auditor engine on a fixed input; the embedded JSON is recomputed and diffed against the engine in CI, so the numbers cannot drift from the code.

Methodology: Price-Blind Research Auditor

Why this exists

When a large language model sees current prices, directional language, or position state while generating a trade thesis, it produces an argument that rationalises whichever direction the data implies. Agents with this contamination are systematically biased toward confirming existing positions. The fix is an architectural one: the LLM operates on price-blind context; only the risk engine, after the thesis is produced, reconciles the view with market state.

This tool is the lint layer for that boundary. It does not "fix" contamination; it reveals it so a human can rewrite the prompt or redact the retrieved context before handing the bundle to the model.
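The boundary described above can be pictured as a type-level separation. The following is a minimal sketch under stated assumptions, not the tool's or any trading system's actual code: the names PriceBlindContext, Thesis, MarketState, generate_thesis, and reconcile are all hypothetical, and the LLM call is replaced by a stand-in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriceBlindContext:
    """Research inputs that have been audited and carry no market state."""
    documents: tuple[str, ...]


@dataclass(frozen=True)
class Thesis:
    ticker: str
    direction: str  # "long" or "short"
    rationale: str


@dataclass(frozen=True)
class MarketState:
    last_price: float
    current_position: float  # signed quantity currently held


def generate_thesis(ticker: str, context: PriceBlindContext) -> Thesis:
    """Stand-in for the LLM call. The signature accepts no MarketState,
    so prices and positions cannot reach the thesis step."""
    rationale = " ".join(context.documents)
    return Thesis(ticker=ticker, direction="long", rationale=rationale)


def reconcile(thesis: Thesis, market: MarketState, max_position: float) -> float:
    """Hypothetical risk-engine step: the first place market state appears.
    Returns the signed order quantity needed to reach the target position.
    A real engine would also use market.last_price for sizing."""
    target = max_position if thesis.direction == "long" else -max_position
    return target - market.current_position
```

The point of the sketch is that the thesis function cannot see MarketState at all; the auditor's job is to check that the text inside the context does not smuggle that state in anyway.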

Rule families

The ruleset targets four classes of leakage:

1. Explicit prices

Ticker–price pairs: SYNTHETIC_A 451.20, BTC $67,400
Dollar-denominated numbers: $451.20
Bid / ask / mid / last quotes: bid: 451.18, ask: 451.22
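As an illustration of what rules in this family might look like, here is a minimal sketch. The regular expressions and the names EXPLICIT_PRICE_PATTERNS and find_explicit_prices are assumptions made for this example; they are not the shipped rules.

```python
import re

# Hypothetical patterns for the three explicit-price shapes listed above.
EXPLICIT_PRICE_PATTERNS = {
    # Upper-case ticker followed by an optionally dollar-prefixed number.
    "ticker_price": re.compile(r"\b[A-Z][A-Z0-9_]{1,15}\s+\$?\d[\d,]*(?:\.\d+)?"),
    # Dollar sign followed by a number, e.g. $451.20 or $67,400.
    "dollar_amount": re.compile(r"\$\d[\d,]*(?:\.\d+)?"),
    # Quote label (bid / ask / mid / last) followed by a number.
    "quote_field": re.compile(
        r"\b(?:bid|ask|mid|last)\s*[:=]?\s*\$?\d[\d,]*(?:\.\d+)?", re.IGNORECASE
    ),
}


def find_explicit_prices(line: str) -> list[tuple[str, str]]:
    """Return (rule_name, matched_text) pairs for one line of text."""
    hits = []
    for name, pattern in EXPLICIT_PRICE_PATTERNS.items():
        for match in pattern.finditer(line):
            hits.append((name, match.group(0)))
    return hits
```

A pattern this loose will also flag strings such as "Q3 2024" as a ticker–price pair, which is consistent with treating every flag as a review item rather than a verdict.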

2. Directional framing

Percentage-move verbs: up 4.7%, dropped 2.1%
Standalone directional verbs: rallying, dumping, ripping, crashing
New-high / new-low language
After-the-fact framing: after the rally, following the crash
Chart-pattern labels: descending triangle, head and shoulders
Recency + price: this morning + numeric context
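The directional-framing rules could be approximated along the following lines. This is a sketch only: the pattern names, the exact verb lists beyond the examples given above, and the helper directional_hits are assumptions, and the shipped rules may differ.

```python
import re

# Illustrative patterns, one per bullet in the list above.
DIRECTIONAL_PATTERNS = {
    "pct_move": re.compile(
        r"\b(?:up|down|rose|fell|gained|lost|dropped|jumped)\s+\d+(?:\.\d+)?\s?%",
        re.IGNORECASE,
    ),
    "directional_verb": re.compile(
        r"\b(?:rallying|dumping|ripping|crashing)\b", re.IGNORECASE
    ),
    "new_extreme": re.compile(
        r"\bnew\s+(?:all-time\s+)?(?:high|low)s?\b", re.IGNORECASE
    ),
    "after_the_fact": re.compile(
        r"\b(?:after|following)\s+the\s+(?:rally|crash)\b", re.IGNORECASE
    ),
    "chart_pattern": re.compile(
        r"\b(?:descending triangle|head and shoulders)\b", re.IGNORECASE
    ),
    # Recency phrase followed by any digit later in the same sentence.
    "recency_price": re.compile(r"\bthis morning\b[^.\n]*?\d", re.IGNORECASE),
}


def directional_hits(line: str) -> list[str]:
    """Return the sorted names of the directional rules that match the line."""
    return sorted(
        name for name, pattern in DIRECTIONAL_PATTERNS.items() if pattern.search(line)
    )
```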

3. Position state / P&L leakage

Position language: open position, held since, long from
Unrealised / realised P&L, mark-to-market values
Stop-loss / take-profit / target-price metadata

4. Sentiment anchors

bullish, bearish, hawkish, dovish, risk-on, risk-off
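To show how rules from different families might be combined into one pass over a bundle, here is a hedged sketch covering the position-state and sentiment families. The Rule structure, the audit function, the regular expressions, and in particular the severity assigned to each rule are assumptions for illustration; this page does not state which severity each shipped rule carries.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    family: str
    severity: str  # "high", "medium", or "low" (assignments below are assumed)
    pattern: re.Pattern


RULES = [
    Rule("position_language", "position_state", "high",
         re.compile(r"\b(?:open position|held since|long from)\b", re.IGNORECASE)),
    Rule("pnl", "position_state", "high",
         re.compile(r"\b(?:un)?reali[sz]ed\s+p&l\b|\bmark-to-market\b", re.IGNORECASE)),
    Rule("order_metadata", "position_state", "medium",
         re.compile(r"\b(?:stop-loss|take-profit|target price)\b", re.IGNORECASE)),
    Rule("sentiment_anchor", "sentiment", "low",
         re.compile(r"\b(?:bullish|bearish|hawkish|dovish|risk-on|risk-off)\b",
                    re.IGNORECASE)),
]


def audit(bundle: str) -> list[dict]:
    """Scan each line of the bundle and return one finding per match."""
    findings = []
    for line_no, line in enumerate(bundle.splitlines(), start=1):
        for rule in RULES:
            for match in rule.pattern.finditer(line):
                findings.append({
                    "line": line_no,
                    "rule": rule.name,
                    "family": rule.family,
                    "severity": rule.severity,
                    "text": match.group(0),
                })
    return findings
```

Keeping the family and severity on the rule itself means the report can be grouped by family and scored without a second lookup table.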

Scoring

Each match is weighted by severity:

high   → 1.0
medium → 0.5
low    → 0.2

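The per-line leakage score is leakage = min(1, total_weight / 10), where total_weight is the sum of the severity weights of all matches. The score saturates at 1 once total_weight reaches 10, for example at ten high-severity matches. Verdict bands: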

0: clean
< 0.2: light
0.2 – 0.5: caution
> 0.5: contaminated
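The weights, formula, and bands above translate directly into a few lines of code. The sketch below follows the stated numbers; the function names leakage_score and verdict are hypothetical, and placing a score of exactly 0.2 or 0.5 in the caution band is a reading of the bands as listed.

```python
SEVERITY_WEIGHTS = {"high": 1.0, "medium": 0.5, "low": 0.2}


def leakage_score(severities: list[str]) -> float:
    """leakage = min(1, total_weight / 10) over the severities of all matches."""
    total_weight = sum(SEVERITY_WEIGHTS[s] for s in severities)
    return min(1.0, total_weight / 10)


def verdict(score: float) -> str:
    """Map a leakage score in [0, 1] to the verdict bands listed above."""
    if score == 0:
        return "clean"
    if score < 0.2:
        return "light"
    if score <= 0.5:
        return "caution"
    return "contaminated"
```

For example, three high-severity matches give a total weight of 3.0 and a score of 0.3, which falls in the caution band; six give 0.6, which is contaminated; anything beyond ten high-severity matches stays capped at 1.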

False-positive discipline

The rules are deliberately conservative — the cost of a false positive is that a developer re-reads a line; the cost of a false negative is a silently contaminated agent. Legitimate-but-flagged content (e.g. discussing historical regimes) should be annotated as a fixture the auditor is expected to catch, then wrapped or redacted before use.

Limitations

Regex, not semantic. The tool catches structural leakage (numbers, verbs, metadata labels). A semantically-rich paraphrase like "today was a very good day for the stock" would slip through.
Language: English only. German/other-language prompts are not covered in this release.
Context window unaware. The tool does not check whether flagged language appears inside a quoted user utterance that the model is expected to treat as data. Treat all flags as review items.
No redaction. Output is a diagnostic report, not a rewritten bundle. Leaving redaction to a human is a deliberate design choice: automated rewriting would strip nuance.
No image / audio scanning. Multi-modal inputs (screenshots, charts, voice notes) are out of scope.

Privacy

All pattern matching runs in the browser. Nothing is uploaded. No cookies, no third-party trackers. Refresh the page and the pasted bundle is gone.

References

Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). "On the Dangers of Stochastic Parrots." Proceedings of FAccT '21.
Lin, S., Hilton, J., & Evans, O. (2022). "TruthfulQA: Measuring How Models Mimic Human Falsehoods." ACL 2022.
Shi, F. et al. (2023). "Large Language Models Can Be Easily Distracted by Irrelevant Context." ICML 2023.
Anthropic (2025). "Constitutional AI and the Separation of Context from Instruction." Technical note.

Connects to

Prompt Regression Tester — after cleaning, regression-test the clean prompt across providers.
Hallucination Detector — a clean prompt doesn't prevent fabrication; check the output too.
Trading System Blueprinter — the price-blind boundary is a load-bearing architectural choice.

External resources

Bias and Concentration in Machine Learning Trading (Lopez de Prado 2018, arXiv)

Changelog

2026-04-20 — Initial release with 14 rules across 4 families, severity weighting, and saturating leakage score.

Worked example
