The short answer
For financial analysis in 2026, the three-way pick among Claude, GPT-5, and Gemini comes down to tier, not vendor. Gemini 2.5 Flash-Lite is the cheapest extraction tier at roughly $0.01 to process a 100k-token 10-K, with Gemini 2.5 Flash close behind at $0.04. Claude Sonnet 4.6 and GPT-5.4 sit in the mid reasoning tier ($0.31 to $0.36 per filing), and Claude Opus 4.7 and GPT-5.5 are the top reasoning tier ($0.60 to $0.62 per filing). Verified list prices are below. The frontier vendors are close enough on price that the decision is workload tier first, vendor relationship second. Match the tier to the task with the Model Selector for Finance.
TL;DR
| Model | Input $/Mtok | Output $/Mtok | Context | ~$/10-K (100k in + 4k out) |
|---|---|---|---|---|
| Claude Opus 4.7 | $5.00 | $25.00 | 1M | $0.60 |
| Claude Sonnet 4.6 | $3.00 | $15.00 | 1M | $0.36 |
| GPT-5.5 | $5.00 | $30.00 | long-context tiered | $0.62 |
| GPT-5.4 | $2.50 | $15.00 | long-context tiered | $0.31 |
| Gemini 3.5 Flash | $1.50 | $9.00 | 1M | $0.19 |
| Gemini 2.5 Pro | $1.25 (≤200k) | $10.00 (≤200k) | 1M+ | $0.17 |
| Gemini 2.5 Flash | $0.30 | $2.50 | 1M | $0.04 |
| Gemini 2.5 Flash-Lite | $0.10 | $0.40 | 1M | $0.01 |
Per-filing costs are computed from the verified list prices in the table (input tokens × input rate + output tokens × output rate), not from a benchmark run. All prices verified 2026-05-25 against each vendor's official pricing page.
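The per-filing arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not the site's own tooling: the `PRICES` dictionary and the `filing_cost` helper are names introduced here, and the rates are simply copied from the table above.

```python
# List prices in USD per million tokens as (input, output), copied from the table above.
PRICES = {
    "Claude Opus 4.7": (5.00, 25.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "GPT-5.5": (5.00, 30.00),
    "GPT-5.4": (2.50, 15.00),
    "Gemini 3.5 Flash": (1.50, 9.00),
    "Gemini 2.5 Pro": (1.25, 10.00),  # rate for prompts of 200k tokens or fewer
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Gemini 2.5 Flash-Lite": (0.10, 0.40),
}


def filing_cost(model, input_tokens=100_000, output_tokens=4_000):
    """Return the list-price cost in USD of one call.

    Defaults match the table's reference filing: 100k tokens in, 4k tokens out.
    """
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    for model in PRICES:
        print(f"{model:<24} ${filing_cost(model):.2f}")
```

Changing `input_tokens` and `output_tokens` to your own filing profile is the quickest way to see whether the tier ordering still holds for your workload.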
The verified 2026 list prices
Anthropic (Claude). Opus 4.7 is $5 / Mtok input and $25 / Mtok output; Sonnet 4.6 is $3 / $15; Haiku 4.5 is $1 / $5. Opus 4.7 and Sonnet 4.6 include the full 1M-token context window at standard rates. Prompt-cache reads cost 0.1x base input (a 90% discount); the Batch API is 50% off.[1]
OpenAI (GPT-5 family). GPT-5.5 is $5 / Mtok input ($0.50 cached) and $30 / Mtok output; GPT-5.4 is $2.50 / $15 ($0.25 cached); GPT-5.4-mini is $0.75 / $4.50. Cached input is a 90% discount on the GPT-5.5 and GPT-5.4 tiers. OpenAI lists separate short- and long-context rates.[2]
Google (Gemini). Gemini 2.5 Pro is $1.25 / Mtok input and $10 / Mtok output for prompts ≤200k tokens, rising to $2.50 / $15 above 200k; Gemini 3.5 Flash is $1.50 / $9 (a frontier agent tier despite the "Flash" name: its output rate is ~3.6x that of Gemini 2.5 Flash); Gemini 2.5 Flash is $0.30 / $2.50; Gemini 2.5 Flash-Lite is $0.10 / $0.40, the cheapest tier. Context-cache reads are billed at roughly 10% of base input.[3]
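The Gemini 2.5 Pro price step is the one rate above that depends on prompt size, so it is worth sketching. The helper below is hypothetical (`gemini_25_pro_cost` is a name introduced here), and it assumes the higher rate applies to the whole call once the prompt exceeds 200k tokens; check the pricing page for the exact billing rule before relying on it.

```python
def gemini_25_pro_cost(input_tokens, output_tokens, threshold=200_000):
    """List-price USD cost of one Gemini 2.5 Pro call.

    Assumption: once the prompt exceeds `threshold` tokens, the higher
    input and output rates apply to the entire call, not just the excess.
    """
    if input_tokens <= threshold:
        in_rate, out_rate = 1.25, 10.00   # USD per Mtok, prompts <= 200k
    else:
        in_rate, out_rate = 2.50, 15.00   # USD per Mtok, prompts > 200k
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    for prompt in (100_000, 200_000, 300_000):
        print(prompt, round(gemini_25_pro_cost(prompt, 4_000), 3))
```

Under that assumption, a single 100k-token 10-K costs about $0.17, while a 300k-token merged filing set costs about $0.81, so splitting a merged set to stay under the step can be cheaper than one large prompt.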
What the prices tell you
Two structural facts dominate.
First, the tier gap dwarfs the vendor gap. Within the top reasoning tier, Opus 4.7 ($0.60/filing) and GPT-5.5 ($0.62/filing) are within 4% of each other. Within the mid tier, Sonnet 4.6 ($0.36) and GPT-5.4 ($0.31) are within 14%. The decision between "frontier reasoning" and "mid reasoning" moves cost ~2x; the decision between Anthropic and OpenAI inside a tier barely moves it. Pick the tier your task needs, then pick the vendor on factors price does not capture.
Second, Gemini wins the budget extraction lane outright. At $0.30 / $2.50, Flash processes a 10-K for ~$0.04: roughly 8x to 9x under the mid tier and ~15x under the top tier. Flash-Lite, at $0.10 / $0.40, drops that to ~$0.01. For high-volume field extraction where the task does not need frontier reasoning, that gap is the whole argument.
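Both structural facts can be checked directly against the per-filing figures. The snippet below is a small sketch; `gap_pct` and the `cost` dictionary are names introduced here, and the gap is measured as a share of the more expensive model.

```python
# Per-filing costs in USD, taken from the TL;DR table.
cost = {
    "Opus 4.7": 0.60,
    "GPT-5.5": 0.62,
    "Sonnet 4.6": 0.36,
    "GPT-5.4": 0.31,
    "Gemini 2.5 Flash": 0.04,
}


def gap_pct(a, b):
    """Gap between two costs as a percentage of the larger one."""
    hi, lo = max(a, b), min(a, b)
    return 100 * (hi - lo) / hi


# Vendor gap inside each tier.
top_vendor_gap = gap_pct(cost["Opus 4.7"], cost["GPT-5.5"])
mid_vendor_gap = gap_pct(cost["Sonnet 4.6"], cost["GPT-5.4"])

# Tier gap inside each vendor, and the budget lane against the top tier.
openai_tier_ratio = cost["GPT-5.5"] / cost["GPT-5.4"]
anthropic_tier_ratio = cost["Opus 4.7"] / cost["Sonnet 4.6"]
flash_vs_top = cost["Opus 4.7"] / cost["Gemini 2.5 Flash"]

if __name__ == "__main__":
    print(f"top-tier vendor gap: {top_vendor_gap:.1f}%")
    print(f"mid-tier vendor gap: {mid_vendor_gap:.1f}%")
    print(f"tier ratio (OpenAI): {openai_tier_ratio:.2f}x")
    print(f"tier ratio (Anthropic): {anthropic_tier_ratio:.2f}x")
    print(f"top tier vs Flash: {flash_vs_top:.0f}x")
```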
Where each model is the right pick for finance
- Long-context synthesis over a full 10-K or merged filing set: Claude Opus 4.7 or Gemini 2.5 Pro. Opus carries 1M context at a flat rate; Gemini 2.5 Pro offers the largest published windows and the cheapest frontier input rate, with the ≤200k vs >200k price step to plan around.
- Mid-tier reasoning at scale (ratio analysis, multi-document comparison): Sonnet 4.6 or GPT-5.4. Near-identical cost; choose on existing vendor relationship and eval results.
- High-volume extraction (pull every numeric field from thousands of filings): Gemini 2.5 Flash. The $0.04/filing rate makes a market-wide sweep affordable; reserve a reasoning tier for the subset that needs judgment.
- Cost-sensitive filtering layer: Claude Haiku 4.5 ($1/$5) or Gemini 2.5 Flash, in front of a frontier model that only sees the filtered subset.
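The filtering-layer pattern in the last bullet lends itself to a quick cost estimate. This is a rough sketch under stated assumptions: `sweep_cost` is a hypothetical helper, the 5,000-filing volume and 10% escalation rate are illustrative placeholders rather than measured figures, and both stages are assumed to read the full 100k-token filing.

```python
def sweep_cost(n_filings, escalate_fraction, cheap_cost=0.04, frontier_cost=0.60):
    """Total USD for a two-stage sweep.

    Every filing passes through the cheap tier; `escalate_fraction` of them
    are then re-read by the frontier tier. Defaults are the per-filing costs
    of Gemini 2.5 Flash and Claude Opus 4.7 from the table.
    """
    return n_filings * (cheap_cost + escalate_fraction * frontier_cost)


if __name__ == "__main__":
    n = 5_000          # illustrative volume, not a measured workload
    escalated = 0.10   # illustrative share needing frontier judgment
    print("two-stage:", round(sweep_cost(n, escalated), 2))
    print("all-frontier:", round(n * 0.60, 2))
```

With these placeholder numbers the two-stage sweep comes to about $500, against about $3,000 for sending every filing to the frontier tier. The saving shrinks as the escalation share rises, so the filter's precision matters as much as its price.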
Caching changes the math for repeated context
If your prompt reuses a large fixed block (a system prompt, a filing-structure schema, a shared instruction set), prompt caching reshapes the per-call cost. Anthropic cache reads are 0.1x base input; OpenAI cached input is a 90% discount on GPT-5.5/5.4; Gemini context-cache reads are ~10% of base input.[1][2][3] For a 50k-token fixed preamble reused across many filings, the cached path becomes cheaper than the uncached path within the first few reuses. The OpenAI Prompt Caching Pricing 2026 page works the break-even math in detail.
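A minimal sketch of the preamble arithmetic follows. The function names are introduced here for illustration. The `write_mult` parameter (what the first, cache-populating call costs relative to base input) is an assumption, since the rates quoted above cover cache reads only; the sketch also assumes the cache stays warm across all calls and ignores any storage fees.

```python
def preamble_cost(n_calls, cached, preamble_tokens=50_000, in_rate=5.00,
                  read_mult=0.1, write_mult=1.0):
    """USD spent on the fixed preamble alone across n_calls requests.

    in_rate is USD per Mtok (default: Opus 4.7 input). read_mult is the
    cache-read multiplier from the text; write_mult is an assumed multiplier
    for the first call that populates the cache.
    """
    per_call = preamble_tokens * in_rate / 1_000_000
    if not cached or n_calls == 0:
        return n_calls * per_call
    # First call writes the cache; the remaining calls read from it.
    return per_call * (write_mult + read_mult * (n_calls - 1))


def break_even_calls(read_mult=0.1, write_mult=1.0, max_calls=1000):
    """Smallest call count at which caching is strictly cheaper, or None."""
    for n in range(1, max_calls + 1):
        if write_mult + read_mult * (n - 1) < n:
            return n
    return None


if __name__ == "__main__":
    print("uncached, 100 calls:", preamble_cost(100, cached=False))
    print("cached, 100 calls:", round(preamble_cost(100, cached=True), 3))
    print("break-even calls:", break_even_calls(write_mult=1.25))
```

Even with a hypothetical write premium of 1.25x, the cached path wins from the second call onward; a premium of 2x pushes the break-even to the third call.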
Verified engine output
The block below runs the Model Selector for Finance on a high-quality, large-context synthesis profile, the hardest finance workload: full-filing reasoning, sub-30s latency, top budget. It returns the tier-appropriate frontier pick. The selector carries its own internal reference rate table, refreshed to the same verified rates as the table above. It prices Claude Opus 4.7 at $5/$25, so the Opus monthlyBudgetEstimate it prints (~$180 at the tool's default reference workload) tracks the verified list price. Opus 4.7, Gemini 2.5 Pro, and Gemini 3.5 Flash tie on combined tier-fit at the top, and Opus ranks first. GPT-5.5 is disqualified here on the context gate despite a near-identical ~$198/mo reference spend: its 400K window does not cover the full 200K-1M band. The engine output is computed live from the shipped bundle, not typed by hand.
Decision guidance
- Classify the task tier: extraction, mid reasoning, or frontier reasoning. This is the dominant cost lever.
- Pick the cheapest model in that tier that clears your accuracy bar on your own documents.
- Run a small eval to break vendor ties inside a tier; the in-tier price gap is too small to decide on price alone.
- Add caching if a large fixed context repeats across calls.
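The first two steps of this guidance can be sketched as a small selection routine. Everything here is illustrative: `Candidate` and `pick` are names introduced for the example, and the accuracy values are made-up placeholders standing in for your own eval results, not measurements of any model.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    tier: str               # "extraction", "mid", or "frontier"
    cost_per_filing: float  # USD, from the TL;DR table
    accuracy: float         # PLACEHOLDER: replace with your own eval score


def pick(candidates, tier, accuracy_bar):
    """Cheapest candidate in `tier` whose accuracy clears the bar, or None."""
    eligible = [c for c in candidates
                if c.tier == tier and c.accuracy >= accuracy_bar]
    return min(eligible, key=lambda c: c.cost_per_filing, default=None)


# Accuracy figures below are invented purely to exercise the function.
CANDIDATES = [
    Candidate("Gemini 2.5 Flash-Lite", "extraction", 0.01, 0.88),
    Candidate("Gemini 2.5 Flash", "extraction", 0.04, 0.94),
    Candidate("GPT-5.4", "mid", 0.31, 0.91),
    Candidate("Claude Sonnet 4.6", "mid", 0.36, 0.93),
    Candidate("Claude Opus 4.7", "frontier", 0.60, 0.96),
    Candidate("GPT-5.5", "frontier", 0.62, 0.96),
]

if __name__ == "__main__":
    choice = pick(CANDIDATES, tier="mid", accuracy_bar=0.90)
    print(choice.name if choice else "no model clears the bar")
```

When two models in a tier both clear the bar, the routine falls back to price, which is where the small in-tier gap makes the eval score the more informative signal.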
Connects to
- Model Selector for Finance: the tier-fit engine behind the recommendation.
- Best LLM for Financial Analysis 2026: the task-tiered pillar.
- Cheapest LLM for SEC Filings 2026: the budget-extraction deep dive.
- OpenAI Prompt Caching Pricing 2026: the caching break-even math.
References
- [1] Anthropic. "Pricing." platform.claude.com, verified 2026-05-25. https://platform.claude.com/docs/en/about-claude/pricing
- [2] OpenAI. "API Pricing." developers.openai.com, verified 2026-05-25. https://developers.openai.com/api/docs/pricing
- [3] Google. "Gemini Developer API pricing." ai.google.dev, verified 2026-05-25. https://ai.google.dev/gemini-api/docs/pricing
Verified engine output
| Field | Value |
|---|---|
| task | synthesize |
| latency | sub_30s |
| cost | b200_plus |
| context | k200_1m |
| quality | high |
| ranked (10 items) | [...] |
Computed live at build time.
Frequently asked questions
- Which is cheapest for financial analysis: Claude, GPT-5, or Gemini?
- Gemini 2.5 Flash-Lite at $0.10/$0.40 per Mtok is the cheapest tier — roughly $0.01 to process a 100k-token 10-K, with Gemini 2.5 Flash close behind at ~$0.04. Both are an order of magnitude under the mid tier.
- Is Claude or GPT-5 better value at the top tier?
- They are within 4% on cost (Opus 4.7 ~$0.60 per filing, GPT-5.5 ~$0.62 per filing). The tier decision matters far more than the vendor decision; break the tie on an eval, not on price.
- Do all three support a full 10-K in context?
- Yes. Claude Opus 4.7 and Sonnet 4.6 carry 1M context at standard rates; Gemini 2.5 Pro offers 1M or more; GPT-5 uses tiered short and long context pricing. A 100k-token 10-K fits all three.
- How much does caching save?
- Anthropic and Gemini cache reads cost about 10% of base input; OpenAI cached input is a 90% discount. For a large fixed preamble reused across calls, the cached path becomes the cheaper one within a few reuses.
- Where do these prices come from?
- Each vendor's official pricing page, verified 2026-05-25. Per-filing costs are computed from those list prices, not from a benchmark run.