
Worked example

Running the shipped structured-schema-validator-finance engine on the input below produces exactly this output. Continuous integration recomputes it against the engine bundle on every build, so these numbers cannot drift from the code.

Input

{
  "tool": "structured_schema_validator_finance",
  "schema_id": "trade_decision",
  "json": {
    "ticker": "AAPL",
    "side": "long",
    "size_shares": 250,
    "rationale": "Earnings beat confirmed the margin-expansion thesis; sized for 1% portfolio risk against the declared stop.",
    "stop_loss": 178.5,
    "take_profit": 212
  }
}

Output

{
  "pass": true,
  "fields": [
    {
      "field": "ticker",
      "status": "ok",
      "message": "ok"
    },
    {
      "field": "side",
      "status": "ok",
      "message": "ok"
    },
    {
      "field": "size_shares",
      "status": "ok",
      "message": "ok"
    },
    {
      "field": "rationale",
      "status": "ok",
      "message": "ok"
    },
    {
      "field": "stop_loss",
      "status": "ok",
      "message": "ok"
    },
    {
      "field": "take_profit",
      "status": "ok",
      "message": "ok"
    }
  ],
  "sanityFlags": [
    "ticker \"AAPL\" looks like a real issuer — aifinhub tools use SYNTHETIC_A / SYNTHETIC_B placeholders"
  ]
}

Frequently asked questions

What does the Structured Schema Validator for Finance methodology page document?
It explains why structured schemas beat free-form LLM output in finance workflows, and how the four built-in schemas and sanity bands are chosen. It also states the formulas, assumptions, data sources, limitations, and reproducibility steps behind the Structured Schema Validator for Finance, in the Finance category.
When was the Structured Schema Validator for Finance methodology last reviewed?
This methodology was last reviewed on 2026-04-23. The matching tool is at https://aifinhub.io/structured-schema-validator-finance/.
Are the Structured Schema Validator for Finance numbers reproducible?
Yes. This page embeds a worked example whose output is the verbatim result of running the shipped structured-schema-validator-finance engine on a fixed input; the embedded JSON is recomputed and diffed against the engine in CI, so the numbers cannot drift from the code.

Methodology · Tool · Last updated 2026-04-23

How Structured Schema Validator for Finance works

How the Structured Schema Validator checks LLM-produced JSON against four canonical finance output schemas, and why the schemas are shaped the way they are.

Why structured schemas beat free-form output

A paragraph of prose from an LLM is unfalsifiable by construction: every downstream consumer — your execution layer, your risk system, your teammate — has to re-parse it, infer what the model "meant," and silently tolerate the cases where the extraction fails. In a finance workflow that translates directly into slippage (bad trades sent because the parser guessed), missed risk (an unclear stop interpreted as no stop), and untracked hallucinations (numbers in the narrative that never survive to the monitoring layer).

A schema flips that default. The model is contractually required to emit fields your systems actually need, the parser either succeeds or raises, and every deviation becomes an auditable event. The usual objection — "but the LLM will sometimes fail to produce valid JSON" — is exactly the point: when the producer fails, the consumer must not pretend everything is fine. You want the loud failure.

This tool does not enforce the schema on the model side. That belongs in the prompt and in the constrained-decoding / tool-call configuration. What it does is give you a stable client-side double-check: paste any JSON your model returned and see, per-field, whether the contract held.
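The per-field double-check described above can be sketched in a few lines. This is a minimal illustration and not the shipped engine: the function name `check_fields`, the `contract` mapping, and the `"error"` status string are assumptions, while the `pass` / `fields` / `status` / `message` shape mirrors the worked example on this page.

```python
import json


def check_fields(raw, contract):
    """Parse raw model output and report, per field, whether the contract held.

    `contract` is a hypothetical mapping of field name -> accepted Python type(s).
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Loud failure: invalid JSON is reported, never repaired.
        return {"pass": False, "fields": [], "error": f"invalid JSON: {exc.msg}"}
    if not isinstance(data, dict):
        return {"pass": False, "fields": [], "error": "top-level value must be an object"}

    fields = []
    for name, accepted in contract.items():
        if name not in data:
            status, message = "error", "missing required field"
        elif isinstance(data[name], bool) or not isinstance(data[name], accepted):
            # bool is rejected explicitly because Python treats it as an int;
            # this sketch assumes the contract has no boolean fields.
            status, message = "error", f"unexpected type {type(data[name]).__name__}"
        else:
            status, message = "ok", "ok"
        fields.append({"field": name, "status": status, "message": message})
    return {"pass": all(f["status"] == "ok" for f in fields), "fields": fields}
```

Calling `check_fields(model_output, {"ticker": str, "size_shares": (int, float)})` returns one entry per contracted field, so a consumer can log exactly which part of the contract broke instead of receiving a single opaque failure.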

The four schemas

Research Output

A research note is only useful if it can be proven wrong later. The schema forces three things: a numeric probability (so you can track calibration over time), a thesis with at least 50 characters (to reject one-line narrative dumps), and invalidation_conditions that are explicit events — not moods. Conviction is a low / medium / high enum so you cannot smuggle extra buckets past a downstream aggregator, and source_citations stops the model from asserting facts without pointers.
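As a rough sketch of those rules, the checks might look like the following. The field names `thesis`, `invalidation_conditions`, and `source_citations` come from the text; the key names `probability` and `conviction`, the [0, 1] probability range, and the "non-empty list of strings" rule are assumptions made for illustration.

```python
CONVICTION_LEVELS = {"low", "medium", "high"}


def is_number(value):
    # bool is excluded because Python treats True/False as integers.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def is_nonempty_string_list(value):
    return (
        isinstance(value, list)
        and len(value) > 0
        and all(isinstance(item, str) and item.strip() for item in value)
    )


def validate_research_output(note):
    """Return a list of hard-validation errors; an empty list means the note passes."""
    errors = []
    probability = note.get("probability")
    # Assumed range: the page only says the probability must be numeric.
    if not is_number(probability) or not 0 <= probability <= 1:
        errors.append("probability: must be a number in [0, 1]")
    thesis = note.get("thesis")
    if not isinstance(thesis, str) or len(thesis) < 50:
        errors.append("thesis: must be a string of at least 50 characters")
    if note.get("conviction") not in CONVICTION_LEVELS:
        errors.append("conviction: must be one of low / medium / high")
    for key in ("invalidation_conditions", "source_citations"):
        if not is_nonempty_string_list(note.get(key)):
            errors.append(f"{key}: must be a non-empty list of strings")
    return errors
```

Note that this only checks structure. Whether an invalidation condition is an explicit event rather than a mood still needs a human reader or a separate classifier.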

Trade Decision

Any LLM-driven execution layer should refuse to submit orders on free-form output; this schema is the minimum contract. side is an enum of long / short / none, so an unrecognized value such as "buy" fails validation instead of being silently mapped to the wrong side. size_shares must be positive and is plausibility-bounded (the validator flags absurd share counts; it does not block them, because that is a business-layer concern). stop_loss and take_profit are cross-checked for the common LLM error of swapping them on short ideas. Examples use SYNTHETIC_A / SYNTHETIC_B placeholders because aifinhub tools never publish trade hints on real issuers.
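One way to picture the enum check and the stop / target cross-check is the sketch below. It assumes that a long idea expects stop_loss below take_profit and a short idea expects the reverse, that a swap is reported as a warning rather than an error, and that `MAX_PLAUSIBLE_SHARES` is a hypothetical threshold; none of these details are confirmed by this page.

```python
SIDES = {"long", "short", "none"}
MAX_PLAUSIBLE_SHARES = 1_000_000  # hypothetical band, not the tool's actual value


def is_number(value):
    # bool is excluded because Python treats True/False as integers.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_trade_decision(trade):
    """Return (errors, flags): errors fail validation, flags are warnings only."""
    errors, flags = [], []
    side = trade.get("side")
    if side not in SIDES:
        errors.append("side: must be one of long / short / none")

    size = trade.get("size_shares")
    if not is_number(size) or size <= 0:
        errors.append("size_shares: must be a positive number")
    elif size > MAX_PLAUSIBLE_SHARES:
        flags.append("size_shares: implausibly large, confirm before routing")

    stop, target = trade.get("stop_loss"), trade.get("take_profit")
    if is_number(stop) and is_number(target):
        if side == "long" and stop >= target:
            flags.append("long idea: stop_loss should sit below take_profit")
        if side == "short" and stop <= target:
            flags.append("short idea: stop_loss should sit above take_profit")
    return errors, flags
```

With the worked example's values (side long, stop_loss 178.5, take_profit 212) this sketch returns no errors and no flags; the same prices on a short idea would raise the swap warning.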

Risk Snapshot

Portfolio-level risk is where unit errors are most expensive. The schema requires net_exposure_usd to carry its unit in the field name, a convention that eliminates an entire class of bugs, and a full Greeks object with delta, gamma, theta, and vega so partial dumps do not slip through. max_drawdown_pct is hard-capped at 100 because any larger value indicates either a unit mix-up (fraction vs percent) or a definitional error in the producer.
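A minimal sketch of these hard checks follows. The key name `greeks` for the nested object is an assumption (the page names only its four members), and no lower bound is applied to max_drawdown_pct because none is stated here.

```python
REQUIRED_GREEKS = ("delta", "gamma", "theta", "vega")


def is_number(value):
    # bool is excluded because Python treats True/False as integers.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_risk_snapshot(snapshot):
    """Return a list of hard-validation errors for a risk snapshot payload."""
    errors = []
    if not is_number(snapshot.get("net_exposure_usd")):
        errors.append("net_exposure_usd: required number")

    greeks = snapshot.get("greeks")  # assumed key name for the nested object
    if not isinstance(greeks, dict):
        errors.append("greeks: required object")
    else:
        # Every greek must be present, so partial dumps are rejected.
        for name in REQUIRED_GREEKS:
            if not is_number(greeks.get(name)):
                errors.append(f"greeks.{name}: required number")

    drawdown = snapshot.get("max_drawdown_pct")
    if not is_number(drawdown):
        errors.append("max_drawdown_pct: required number")
    elif drawdown > 100:
        errors.append("max_drawdown_pct: exceeds 100, check fraction vs percent")
    return errors
```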

Peer Comparison

Peer sets are where LLMs most often produce confident-sounding mush. The schema forces the model to name at least two peers and at least two metrics that were actually compared, and to write a narrative of at least 100 characters synthesizing the differences. A target that also appears in its own peer list triggers a sanity flag; this is a surprisingly common failure mode.
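The rules above could be checked roughly as follows. The key names `target`, `peers`, `metrics`, and `narrative` are hypothetical, since this page describes the constraints without listing the exact field names.

```python
def validate_peer_comparison(doc):
    """Return (errors, flags) for a peer-comparison payload with assumed key names."""
    errors, flags = [], []
    target = doc.get("target")
    peers = doc.get("peers")
    metrics = doc.get("metrics")
    narrative = doc.get("narrative")

    if not isinstance(peers, list) or len(peers) < 2:
        errors.append("peers: at least two peers are required")
    if not isinstance(metrics, list) or len(metrics) < 2:
        errors.append("metrics: at least two compared metrics are required")
    if not isinstance(narrative, str) or len(narrative) < 100:
        errors.append("narrative: at least 100 characters are required")

    # Soft check: the target listed as its own peer is a warning, not a failure.
    if isinstance(target, str) and isinstance(peers, list) and target in peers:
        flags.append(f"target {target!r} also appears in the peer list")
    return errors, flags
```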

Sanity-check approach

The validator distinguishes between hard validation (type, required, enum, numeric min/max inside the schema itself) and soft sanity bands (plausibility ranges for values that are technically valid but obviously wrong in finance context). Soft bands never fail validation — they only emit warnings.

The bands are drawn from public real-world distributions rather than hand-picked: retail leverage ratios above 4× fall outside Reg T norms and therefore warrant a warning; net exposures above 100M USD signal an institutional-scale book and therefore prompt a unit-confirmation check; EPS values outside [-200, 200] almost always indicate a unit mix-up (cents vs dollars) rather than a legitimate edge case. A separate heuristic warns when a revenue / earnings / EPS / EBITDA field name has no embedded unit, following a convention borrowed from well-run financial data pipelines that eliminates whole categories of downstream bugs.
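The soft bands can be pictured as a function that only ever returns warnings. In this sketch the thresholds (4×, 100M USD, [-200, 200]) come from the text, while the matching rules on field names and the list of recognized unit suffixes are assumptions chosen for illustration.

```python
import re

# Hypothetical list of unit suffixes; the tool's actual list is not documented here.
UNIT_SUFFIX = re.compile(r"_(usd|eur|gbp|pct|bps|cents)$")
UNIT_SENSITIVE_WORDS = {"revenue", "earnings", "eps", "ebitda"}


def is_number(value):
    # bool is excluded because Python treats True/False as integers.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def sanity_flags(payload):
    """Return plausibility warnings; soft bands never fail validation."""
    flags = []
    for name, value in payload.items():
        if not is_number(value):
            continue
        lowered = name.lower()
        tokens = set(lowered.split("_"))
        if "leverage" in tokens and value > 4:
            flags.append(f"{name}: leverage above 4x is outside retail norms")
        if lowered == "net_exposure_usd" and abs(value) > 100_000_000:
            flags.append(f"{name}: above 100M USD, confirm the unit")
        if "eps" in tokens and not -200 <= value <= 200:
            flags.append(f"{name}: outside [-200, 200], possible cents vs dollars mix-up")
        if tokens & UNIT_SENSITIVE_WORDS and not UNIT_SUFFIX.search(lowered):
            flags.append(f"{name}: field name has no embedded unit")
    return flags
```

Keeping this function separate from hard validation means a payload can pass while still carrying warnings, which matches the worked example on this page where "pass" is true alongside a non-empty sanityFlags list.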

None of this is a substitute for domain review. What it gets you is a first-pass filter that catches the obvious mistakes before they reach a human reviewer who has better things to do than re-read a model's JSON.

What the tool does not do

  • No live pricing or market data — the validator is 100% client-side and does not call any API.
  • No fact-checking. A probability of 0.6 on a made-up thesis passes structural validation. Sanity bands check plausibility of magnitudes, not truth of claims.
  • No schema generation. The four schemas are opinionated, curated starting points — not a general-purpose schema DSL.
  • No investment advice. This is a developer tool for LLM pipelines; any trade implied by the example JSON is illustrative.

Changelog

  • 2026-04-23 — Initial release. Four schemas (Research Output, Trade Decision, Risk Snapshot, Peer Comparison). Hard validation + soft sanity bands. 100% client-side.
Planning estimates only — not financial, tax, or investment advice.