The short answer

For financial analysis in 2026, the three-way pick among Claude, GPT-5, and Gemini comes down to tier, not vendor. Gemini 2.5 Flash-Lite is the cheapest extraction tier at roughly $0.01 to process a 100k-token 10-K, with Gemini 2.5 Flash close behind at $0.04. Claude Sonnet 4.6 and GPT-5.4 sit in the mid reasoning tier ($0.31–$0.36 per filing), and Claude Opus 4.7 and GPT-5.5 form the top reasoning tier ($0.60–$0.62 per filing). Verified list prices are below. The frontier vendors are close enough on price that the decision is workload tier first, vendor relationship second. Match the tier to the task with the Model Selector for Finance.

TL;DR

| Model | Input $/Mtok | Output $/Mtok | Context | ~$/10-K (100k in + 4k out) |
|---|---|---|---|---|
| Claude Opus 4.7 | $5.00 | $25.00 | 1M | $0.60 |
| Claude Sonnet 4.6 | $3.00 | $15.00 | 1M | $0.36 |
| GPT-5.5 | $5.00 | $30.00 | long-context tiered | $0.62 |
| GPT-5.4 | $2.50 | $15.00 | long-context tiered | $0.31 |
| Gemini 3.5 Flash | $1.50 | $9.00 | 1M | $0.19 |
| Gemini 2.5 Pro | $1.25 (≤200k) | $10.00 (≤200k) | 1M+ | $0.17 |
| Gemini 2.5 Flash | $0.30 | $2.50 | 1M | $0.04 |
| Gemini 2.5 Flash-Lite | $0.10 | $0.40 | 1M | $0.01 |

Per-filing costs are computed from the verified list prices in the table (input tokens × input rate + output tokens × output rate), not from a benchmark run. All prices verified 2026-05-25 against each vendor's official pricing page.
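The per-filing arithmetic can be reproduced with a few lines of Python. This is a minimal sketch, not the site's own calculator: the names `LIST_PRICES`, `filing_cost`, and `to_cents` are made up for illustration, and rounding half-up to the cent is an assumption that happens to reproduce the table's figures.

```python
from decimal import Decimal, ROUND_HALF_UP

# (input $/Mtok, output $/Mtok), copied from the table above.
LIST_PRICES = {
    "Claude Opus 4.7": ("5.00", "25.00"),
    "Claude Sonnet 4.6": ("3.00", "15.00"),
    "GPT-5.5": ("5.00", "30.00"),
    "GPT-5.4": ("2.50", "15.00"),
    "Gemini 3.5 Flash": ("1.50", "9.00"),
    "Gemini 2.5 Pro": ("1.25", "10.00"),  # rate for prompts of 200k tokens or fewer
    "Gemini 2.5 Flash": ("0.30", "2.50"),
    "Gemini 2.5 Flash-Lite": ("0.10", "0.40"),
}

MTOK = Decimal(1_000_000)


def filing_cost(model, input_tokens=100_000, output_tokens=4_000):
    """Unrounded USD cost of one call at list price."""
    input_rate, output_rate = (Decimal(r) for r in LIST_PRICES[model])
    return (input_tokens * input_rate + output_tokens * output_rate) / MTOK


def to_cents(amount):
    """Round to the nearest cent, halves up (an assumed rounding rule)."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    for model in LIST_PRICES:
        print(f"{model:<24} ${to_cents(filing_cost(model))}")
```

`Decimal` is used instead of floats so that borderline values such as Gemini 2.5 Pro's $0.165 round the same way every time.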

The verified 2026 list prices

Anthropic (Claude). Opus 4.7 is $5 / Mtok input and $25 / Mtok output; Sonnet 4.6 is $3 / $15; Haiku 4.5 is $1 / $5. Opus 4.7 and Sonnet 4.6 include the full 1M-token context window at standard rates. Prompt-cache reads cost 0.1x base input (a 90% discount); the Batch API is 50% off. [1]

OpenAI (GPT-5 family). GPT-5.5 is $5 / Mtok input ($0.50 cached) and $30 / Mtok output; GPT-5.4 is $2.50 / $15 ($0.25 cached); GPT-5.4-mini is $0.75 / $4.50. Cached input is a 90% discount on the GPT-5.5 and GPT-5.4 tiers. OpenAI lists separate short- and long-context rates. [2]

Google (Gemini). Gemini 2.5 Pro is $1.25 / Mtok input and $10 / Mtok output for prompts ≤200k tokens, rising to $2.50 / $15 above 200k; Gemini 3.5 Flash is $1.50 / $9 (a frontier agent tier despite the "Flash" name, with an output rate ~3.6x that of Gemini 2.5 Flash); Gemini 2.5 Flash is $0.30 / $2.50; Gemini 2.5 Flash-Lite is $0.10 / $0.40, the cheapest tier. Context-cache reads are billed at roughly 10% of base input. [3]
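The Gemini 2.5 Pro price step is easy to model when planning merged filing sets. The sketch below assumes that a prompt over 200k tokens reprices the whole request (input and output) at the higher rates; confirm that billing rule against Google's pricing page before relying on it. The function name `gemini_25_pro_cost` is hypothetical.

```python
def gemini_25_pro_cost(input_tokens, output_tokens):
    """USD cost of one Gemini 2.5 Pro call at the list prices quoted above.

    Assumption: once the prompt exceeds 200k tokens, the entire request is
    billed at the higher tier rather than only the tokens above the threshold.
    """
    if input_tokens <= 200_000:
        input_rate, output_rate = 1.25, 10.00
    else:
        input_rate, output_rate = 2.50, 15.00
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    # A single 10-K versus a merged filing set that crosses the threshold.
    for prompt_tokens in (100_000, 200_000, 250_000):
        print(prompt_tokens, round(gemini_25_pro_cost(prompt_tokens, 4_000), 4))
```

Under that assumption, a prompt just over the threshold costs about twice as much as one just under it, so splitting a merged set into sub-200k calls may be worth testing.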

What the prices tell you

Two structural facts dominate.

First, the tier gap dwarfs the vendor gap. Within the top reasoning tier, Opus 4.7 ($0.60/filing) and GPT-5.5 ($0.62/filing) are within 4% of each other. Within the mid tier, Sonnet 4.6 ($0.36) and GPT-5.4 ($0.31) are within 14%. The decision between "frontier reasoning" and "mid reasoning" moves cost ~2x; the decision between Anthropic and OpenAI inside a tier barely moves it. Pick the tier your task needs, then pick the vendor on factors price does not capture.

Second, Gemini wins the budget extraction lane outright. At $0.30 / $2.50, Gemini 2.5 Flash processes a 10-K for ~$0.04: roughly an order of magnitude under the mid tier and ~15x under the top tier. Flash-Lite, at $0.10 / $0.40, brings that down to ~$0.01. For high-volume field extraction where the task does not need frontier reasoning, that gap is the whole argument.
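Both structural facts fall out of the rounded per-filing figures. The following sketch recomputes the gaps; `PER_FILING` and `relative_gap` are illustrative names, and the gap is measured against the more expensive model of each pair, which is one reasonable convention among several.

```python
# Rounded per-filing costs (USD) from the TL;DR table.
PER_FILING = {
    "Claude Opus 4.7": 0.60,
    "GPT-5.5": 0.62,
    "Claude Sonnet 4.6": 0.36,
    "GPT-5.4": 0.31,
    "Gemini 2.5 Flash": 0.04,
}


def relative_gap(a, b):
    """Price difference as a fraction of the more expensive option."""
    hi, lo = max(a, b), min(a, b)
    return (hi - lo) / hi


top_gap = relative_gap(PER_FILING["Claude Opus 4.7"], PER_FILING["GPT-5.5"])
mid_gap = relative_gap(PER_FILING["Claude Sonnet 4.6"], PER_FILING["GPT-5.4"])
tier_step = PER_FILING["Claude Opus 4.7"] / PER_FILING["GPT-5.4"]
flash_vs_top = PER_FILING["Claude Opus 4.7"] / PER_FILING["Gemini 2.5 Flash"]

if __name__ == "__main__":
    print(f"vendor gap, top tier: {top_gap:.1%}")
    print(f"vendor gap, mid tier: {mid_gap:.1%}")
    print(f"mid -> top tier step: {tier_step:.2f}x")
    print(f"Flash vs top tier:    {flash_vs_top:.1f}x")
```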

Where each model is the right pick for finance

  • Long-context synthesis over a full 10-K or merged filing set: Claude Opus 4.7 or Gemini 2.5 Pro. Opus carries 1M context at a flat rate; Gemini 2.5 Pro offers the largest published windows and the cheapest frontier input rate, with the ≤200k vs >200k price step to plan around.
  • Mid-tier reasoning at scale (ratio analysis, multi-document comparison): Sonnet 4.6 or GPT-5.4. Near-identical cost; choose on existing vendor relationship and eval results.
  • High-volume extraction (pull every numeric field from thousands of filings): Gemini 2.5 Flash. The $0.04/filing rate makes a market-wide sweep affordable; reserve a reasoning tier for the subset that needs judgment.
  • Cost-sensitive filtering layer: Claude Haiku 4.5 ($1/$5) or Gemini 2.5 Flash, in front of a frontier model that only sees the filtered subset.
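The filtering-layer pattern in the last bullet can be costed with a rough sketch. It assumes the filter reads each full filing at the Gemini 2.5 Flash per-filing rate and that escalated filings are re-read in full by Opus 4.7; the workload numbers (5,000 filings, 10% escalated) are hypothetical, as is the function name `cascade_cost`.

```python
def cascade_cost(n_filings, pass_rate, filter_cost=0.04, frontier_cost=0.60):
    """Total USD for a two-stage pipeline.

    Every filing goes through the cheap filter; only the fraction
    `pass_rate` is escalated to the frontier model.
    """
    escalated = round(n_filings * pass_rate)
    return n_filings * filter_cost + escalated * frontier_cost


if __name__ == "__main__":
    n = 5_000  # hypothetical sweep size
    print("frontier only:", n * 0.60)
    print("cascade, 10% escalated:", cascade_cost(n, 0.10))
```

With these rates the cascade stays cheaper than sending everything to the frontier model as long as fewer than about 93% of filings escalate (1 - 0.04/0.60).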

Caching changes the math for repeated context

If your prompt reuses a large fixed block (a system prompt, a filing-structure schema, a shared instruction set), prompt caching reshapes the per-call cost. Anthropic cache reads are 0.1x base input; OpenAI cached input is a 90% discount on GPT-5.5/5.4; Gemini context-cache reads are ~10% of base input. [1][2][3] For a 50k-token fixed preamble reused across many filings, the cached path can dominate the uncached path after the first few reuses. The OpenAI Prompt Caching Pricing 2026 page works the break-even math in detail.
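A small sketch of the break-even arithmetic for the fixed preamble follows. The 0.1x read multiplier comes from the figures above; the `write_mult` premium on the first, cache-populating call is a placeholder assumption that this article does not state, so set it to 1.0 where no write premium applies and check the vendor's page for the real value. Storage fees, cache expiry, and the function names are likewise outside what the article covers.

```python
def preamble_cost(n_calls, preamble_tokens, input_rate, cached,
                  read_mult=0.1, write_mult=1.25):
    """USD spent on the fixed preamble alone across n_calls.

    write_mult is an assumed premium on the first (cache-writing) call;
    read_mult is the cached-read multiplier quoted in the text.
    """
    base = preamble_tokens * input_rate / 1_000_000
    if not cached or n_calls == 0:
        return n_calls * base
    return base * write_mult + (n_calls - 1) * base * read_mult


def break_even_calls(preamble_tokens, input_rate, read_mult=0.1, write_mult=1.25):
    """Smallest number of calls at which the cached path is cheaper."""
    if read_mult >= 1:
        raise ValueError("cached reads must be cheaper than base input")
    n = 1
    while (preamble_cost(n, preamble_tokens, input_rate, True, read_mult, write_mult)
           >= preamble_cost(n, preamble_tokens, input_rate, False)):
        n += 1
    return n


if __name__ == "__main__":
    # 50k-token preamble at Sonnet 4.6's $3 / Mtok input rate.
    print(break_even_calls(50_000, 3.00))
    print(preamble_cost(100, 50_000, 3.00, cached=False))
    print(preamble_cost(100, 50_000, 3.00, cached=True))
```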

Verified engine output

The block below runs the Model Selector for Finance on a high-quality, large-context synthesis profile (the hardest finance workload: full-filing reasoning, sub-30s latency, top budget) and returns the tier-appropriate frontier pick. The selector carries its own internal reference rate table, refreshed to the same verified rates as the table above. It prices Claude Opus 4.7 at $5/$25, so the Opus monthlyBudgetEstimate it prints (~$180 at the tool's default reference workload) tracks the verified list price. Opus 4.7, Gemini 2.5 Pro, and Gemini 3.5 Flash tie on combined tier-fit at the top, with Opus ranked first. GPT-5.5 is disqualified here on the context gate (the selector treats its 400K window as falling outside the 200K-1M band) despite a near-identical ~$198/mo reference spend. The engine output is computed live from the shipped bundle, not typed by hand.

Decision guidance

  1. Classify the task tier: extraction, mid reasoning, or frontier reasoning. This is the dominant cost lever.
  2. Pick the cheapest model in that tier that clears your accuracy bar on your own documents.
  3. Run a small eval to break vendor ties inside a tier; the in-tier price gap is too small to decide on price alone.
  4. Add caching if a large fixed context repeats across calls.
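Steps 1 to 3 can be sketched as a small selection routine. The tier membership and per-filing costs come from the figures above; the accuracy scores in the usage example are invented placeholders standing in for your own eval results, and `TIERS` and `pick_model` are hypothetical names.

```python
# Per-filing cost (USD) for the models this article places in each tier.
TIERS = {
    "extraction": {"Gemini 2.5 Flash-Lite": 0.01, "Gemini 2.5 Flash": 0.04},
    "mid": {"GPT-5.4": 0.31, "Claude Sonnet 4.6": 0.36},
    "frontier": {"Claude Opus 4.7": 0.60, "GPT-5.5": 0.62},
}


def pick_model(tier, eval_accuracy, accuracy_bar):
    """Return the cheapest model in `tier` whose measured accuracy clears the bar.

    eval_accuracy maps model name -> accuracy on your own documents.
    Returns None when nothing in the tier qualifies, which is the signal
    to move up a tier.
    """
    candidates = [
        (cost, model)
        for model, cost in TIERS[tier].items()
        if eval_accuracy.get(model, 0.0) >= accuracy_bar
    ]
    return min(candidates)[1] if candidates else None


if __name__ == "__main__":
    # Placeholder scores, not measured results.
    scores = {"Gemini 2.5 Flash-Lite": 0.91, "Gemini 2.5 Flash": 0.97}
    print(pick_model("extraction", scores, accuracy_bar=0.95))
```

Caching (step 4) is left out on purpose, since it changes the cost of a chosen model rather than the choice of tier.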

References

Footnotes

  1. Anthropic. "Pricing." platform.claude.com, verified 2026-05-25. https://platform.claude.com/docs/en/about-claude/pricing

  2. OpenAI. "API Pricing." developers.openai.com, verified 2026-05-25. https://developers.openai.com/api/docs/pricing

  3. Google. "Gemini Developer API pricing." ai.google.dev, verified 2026-05-25. https://ai.google.dev/gemini-api/docs/pricing

Verified engine output

Frontier synthesis tier: full-filing reasoning, sub-30s, top budget
Inputs
task: synthesize
latency: sub_30s
cost: b200_plus
context: k200_1m
quality: high
Result
ranked (10 items)[...]

Computed live at build time.

Frequently asked questions

Which is cheapest for financial analysis: Claude, GPT-5, or Gemini?
Gemini 2.5 Flash-Lite at $0.10/$0.40 per Mtok is the cheapest tier: roughly $0.01 to process a 100k-token 10-K, with Gemini 2.5 Flash close behind at ~$0.04. Both are roughly an order of magnitude or more under the mid tier.
Is Claude or GPT-5 better value at the top tier?
They are within 4% on cost (Opus 4.7 ~$0.60 per filing, GPT-5.5 ~$0.62 per filing). The tier decision matters far more than the vendor decision; break the tie on an eval, not on price.
Do all three support a full 10-K in context?
Yes. Claude Opus 4.7 and Sonnet 4.6 carry 1M context at standard rates; Gemini 2.5 Pro offers 1M or more; GPT-5.5 has a 400K window, with tiered short- and long-context pricing. A 100k-token 10-K fits all three.
How much does caching save?
Anthropic and Gemini cache reads cost about 10% of base input; OpenAI cached input is a 90% discount. For a large fixed preamble reused across calls, caching can dominate after a few reuses.
Where do these prices come from?
Each vendor's official pricing page, verified 2026-05-25. Per-filing costs are computed from those list prices, not from a benchmark run.