The short answer

After Google I/O 2026, the Gemini lineup for finance work is a clean three-rung cost ladder, and Gemini 3.5 Flash (launched May 19) sits at the top of it, not the bottom. On a general finance research loop (12k in / 2k out per call, 5 calls per idea, 20 ideas/day) the Token Cost Optimizer prices Gemini 3.5 Flash at $0.0360/call and $118.80/month: essentially tied with Gemini 2.5 Pro ($115.50/mo), about 4x Gemini 2.5 Flash ($28.38/mo), and ~18x Gemini 2.5 Flash-Lite ($6.60/mo). The "Flash" name suggests budget; the price says frontier. Every number below is recomputed live from the shipped engine bundle.

TL;DR: the 2026 Gemini cost ladder

| Rung | Model | $/Mtok in | $/Mtok out | Cost / idea | Cost / month |
|---|---|---|---|---|---|
| Budget floor | Gemini 2.5 Flash-Lite | $0.10 | $0.40 | $0.0110 | $6.60 |
| Fast economy | Gemini 2.5 Flash | $0.30 | $2.50 | $0.0473 | $28.38 |
| Frontier (latency) | Gemini 3.5 Flash | $1.50 | $9.00 | $0.1980 | $118.80 |
| Frontier (context) | Gemini 2.5 Pro | $1.25 | $10.00 | $0.1925 | $115.50 |

Same loop for every row: 12,000 input + 2,000 output tokens per call, 5 calls per idea, 10% retry, 20 ideas/day, 0.25 validation rate, 0.50 cache-hit. Costs are the engine's own output on each model's verified list rate, not a benchmark run.
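
The table rows can be reproduced by hand from the list rates and the loop shape. The sketch below is not the Token Cost Optimizer's code; it is a minimal reconstruction under assumptions inferred from the published figures: retries are treated as a `(1 + retry_rate)` multiplier on calls, a month is 30 days, and no cache discount is applied (the published per-call figures equal the uncached list-rate arithmetic). The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    usd_per_mtok_in: float
    usd_per_mtok_out: float


# List rates copied from the table above.
MODELS = [
    Model("Gemini 2.5 Flash-Lite", 0.10, 0.40),
    Model("Gemini 2.5 Flash", 0.30, 2.50),
    Model("Gemini 3.5 Flash", 1.50, 9.00),
    Model("Gemini 2.5 Pro", 1.25, 10.00),
]


def cost_per_call(m, input_tokens=12_000, output_tokens=2_000):
    """USD for one call at list rates (no cache discount assumed)."""
    return (input_tokens * m.usd_per_mtok_in
            + output_tokens * m.usd_per_mtok_out) / 1_000_000


def cost_per_idea(m, calls_per_idea=5, retry_rate=0.10):
    """Assumption: retries add retry_rate extra calls on average."""
    return cost_per_call(m) * calls_per_idea * (1 + retry_rate)


def cost_per_month(m, ideas_per_day=20, days=30):
    """Assumption: a month is 30 days."""
    return cost_per_idea(m) * ideas_per_day * days


for m in MODELS:
    print(f"{m.name:<22} ${cost_per_idea(m):.4f}/idea  ${cost_per_month(m):.2f}/month")
```

Under these assumptions the printed values match the Cost / idea and Cost / month columns for all four rows.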

The three rungs

Rung 1, budget floor: Gemini 2.5 Flash-Lite ($0.10 / $0.40). The cheapest tier in the lineup, $6.60/mo on this loop. A 1M context window means it still fits a full 10-K. This is the model for high-volume extraction, classification, routing, and any step where the work is structural rather than reasoning-heavy.

Rung 2, fast economy: Gemini 2.5 Flash ($0.30 / $2.50). $28.38/mo, ~4.3x Flash-Lite. The default workhorse for mid-weight tasks: summarization, light synthesis, multi-document comparison that does not need frontier judgment.

Rung 3, frontier: Gemini 3.5 Flash ($1.50 / $9.00) and Gemini 2.5 Pro ($1.25 / $10.00). $118.80/mo and $115.50/mo respectively, within 3% of each other and ~18x the budget floor. Gemini 3.5 Flash buys agent-tier reasoning at Flash latency; Gemini 2.5 Pro buys a 2M context window. The price barely separates them; the choice is latency-vs-context.

Where Gemini 3.5 Flash actually fits

The mistake the name invites is treating Gemini 3.5 Flash as a drop-in upgrade for Gemini 2.5 Flash. It is not a fast-economy model: its output rate ($9.00) is ~3.6x Gemini 2.5 Flash's ($2.50) and the loop cost reflects that ($118.80 vs $28.38/mo, a 4.2x jump). Gemini 3.5 Flash earns its place when the task genuinely needs frontier reasoning and you cannot accept Pro-tier latency. For anything below that bar, you are paying a frontier premium for economy-tier work.
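
Why is the loop jump (4.2x) larger than the output-rate jump (3.6x)? Because the input rate rises 5x ($0.30 to $1.50), and the loop cost is a token-weighted blend of the two. A small sketch of that arithmetic, using the 12k-in / 2k-out call shape from this article (the variable names are illustrative only):

```python
IN_TOK, OUT_TOK = 12_000, 2_000  # call shape used throughout this article

# USD per million tokens, from the cost ladder table.
flash_25 = {"in": 0.30, "out": 2.50}
flash_35 = {"in": 1.50, "out": 9.00}


def per_call(rates):
    """USD for one call at list rates."""
    return (IN_TOK * rates["in"] + OUT_TOK * rates["out"]) / 1_000_000


input_ratio = flash_35["in"] / flash_25["in"]
output_ratio = flash_35["out"] / flash_25["out"]
loop_ratio = per_call(flash_35) / per_call(flash_25)

# The blended ratio always falls between the two per-token ratios.
print(f"input rate:  {input_ratio:.1f}x")
print(f"output rate: {output_ratio:.1f}x")
print(f"per call:    {loop_ratio:.2f}x")
```

A more input-heavy call shape pushes the blended ratio toward 5x; a more output-heavy one pushes it toward 3.6x.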

The cleanest pattern is to tier the loop: run the bulk of the calls on Flash-Lite or Flash, and route only the steps that need agent-tier judgment to Gemini 3.5 Flash. On the loop above, a mostly-economy loop with a thin frontier layer lands far closer to the $6-28/mo rungs than to the $119/mo rung.
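
To put a number on the tiered pattern, here is a rough sketch that prices a mixed loop. The 4-to-1 split (four Flash-Lite calls and one Gemini 3.5 Flash call per idea) is an assumption chosen for illustration, not a recommendation from the engine, and it assumes every step keeps the same 12k-in / 2k-out token shape. The dictionary keys and function name are hypothetical labels.

```python
# Per-call USD at 12k in / 2k out, from the list rates in the table.
PER_CALL = {
    "flash-lite": 0.0020,
    "flash": 0.0086,
    "3.5-flash": 0.0360,
}


def tiered_monthly_cost(call_mix, retry_rate=0.10, ideas_per_day=20, days=30):
    """Monthly USD for a loop whose calls per idea are split across models.

    call_mix maps a model label to the number of calls per idea on that model.
    Assumes a (1 + retry_rate) multiplier and a 30-day month.
    """
    per_idea = sum(PER_CALL[model] * n for model, n in call_mix.items())
    return per_idea * (1 + retry_rate) * ideas_per_day * days


all_frontier = tiered_monthly_cost({"3.5-flash": 5})
thin_frontier = tiered_monthly_cost({"flash-lite": 4, "3.5-flash": 1})

print(f"all calls on 3.5 Flash:        ${all_frontier:.2f}/month")
print(f"4x Flash-Lite + 1x 3.5 Flash:  ${thin_frontier:.2f}/month")
```

With that assumed split the loop comes to about $29/month against $118.80 for the all-frontier version; the single frontier call accounts for most of what remains.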

Google's launch claim, stated as Google's claim

At I/O, Google said Gemini 3.5 Flash beats Gemini 3.1 Pro on coding and agentic benchmarks. That is a vendor benchmark. There was no independent third-party finance-task eval at launch, and none was run here. This article ranks the models on cost (computed from verified list prices) and says nothing about which reads a filing or reasons over a thesis most accurately. Confirm capability on your own task; the ladder here only tells you what each rung costs.

Decision guidance

  1. Start at the bottom. Default a finance step to Flash-Lite; promote it up a rung only when an eval shows the cheaper rung misses.
  2. Treat Gemini 3.5 Flash as a frontier choice, not an economy upgrade. Budget ~$119/mo for the loop above; the bill scales linearly with call volume and with input and output tokens per call.
  3. Pick the top rung on latency vs context. Gemini 3.5 Flash for Flash latency, Gemini 2.5 Pro for the 2M window. Cost is a near-tie.
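  4. Output tokens are the expensive lever at the top rung. Per token they cost 6x (Gemini 3.5 Flash) to 8x (Gemini 2.5 Pro) as much as input, so on this loop the 2,000 output tokens account for half or more of each call's cost. Terse outputs cut the bill fastest; tighter prompts come second.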
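
The weight of output tokens in the last point can be checked with a short calculation. This is a sketch using the list rates and the 12k-in / 2k-out call shape from this article; the helper name is hypothetical.

```python
def cost_split(usd_in, usd_out, input_tokens=12_000, output_tokens=2_000):
    """Return (input USD, output USD, output share) for one call at list rates."""
    in_cost = input_tokens * usd_in / 1_000_000
    out_cost = output_tokens * usd_out / 1_000_000
    return in_cost, out_cost, out_cost / (in_cost + out_cost)


# USD per million tokens (input, output), from the cost ladder table.
RATES = {
    "Gemini 3.5 Flash": (1.50, 9.00),
    "Gemini 2.5 Pro": (1.25, 10.00),
}

for name, (usd_in, usd_out) in RATES.items():
    in_cost, out_cost, share = cost_split(usd_in, usd_out)
    print(f"{name}: input ${in_cost:.3f}, output ${out_cost:.3f}, "
          f"output share {share:.0%}")
```

Output is only 2,000 of the 14,000 tokens per call (about 14%), yet it makes up half of the per-call cost on Gemini 3.5 Flash and about 57% on Gemini 2.5 Pro, which is why trimming output pays off more per token than trimming the prompt.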

Verified engine output

Finance research loop: Gemini 2.5 Flash-Lite (budget floor)

Inputs
input_tokens_per_call: 12000
output_tokens_per_call: 2000
calls_per_idea: 5
retry_rate: 0.1
ideas_per_day: 20
validation_rate: 0.25
cache_hit_rate: 0.5
model_id: gemini-2-5-flash-lite

Result
model › id: gemini-2-5-flash-lite
model › provider: google
model › name: Gemini 2.5 Flash-Lite
model › input usd per mtoken: 0.1
model › output usd per mtoken: 0.4
model › context window: 1000000
model › notes: Cheapest tier in this table; 1M context.
effective cost per call: 0.002
cost per idea: 0.011
cost per validated trade: 0.044
cost per day: 0.21999999999999997
cost per month: 6.6
cost per year: 80.3

Computed live at build time.
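
The derived fields in a result block follow from the effective cost per call. The sketch below is a reconstruction inferred from the published numbers, not the engine's source: it assumes a `(1 + retry_rate)` multiplier, a 30-day month, and a 365-day year. The function name and output keys are hypothetical.

```python
def derive(cost_per_call, calls_per_idea=5, retry_rate=0.1,
           ideas_per_day=20, validation_rate=0.25):
    """Rebuild the derived result fields from one per-call cost (USD)."""
    per_idea = cost_per_call * calls_per_idea * (1 + retry_rate)
    per_day = per_idea * ideas_per_day
    return {
        "cost_per_idea": per_idea,
        # Only a fraction of ideas become validated trades.
        "cost_per_validated_trade": per_idea / validation_rate,
        "cost_per_day": per_day,
        "cost_per_month": per_day * 30,   # assumed 30-day month
        "cost_per_year": per_day * 365,   # assumed 365-day year
    }


# Gemini 2.5 Flash-Lite: effective cost per call is 0.002.
for field, value in derive(0.002).items():
    print(f"{field}: {value:.4f}")
```

Two things fall out of this. The yearly figure is not twelve times the monthly one ($80.30 vs $79.20), because the month and year use different day counts. And long tails such as 0.21999999999999997 are ordinary binary floating-point rounding of 0.22, so rounding for display is safe.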

Same loop: Gemini 2.5 Flash (fast economy)

Inputs
input_tokens_per_call: 12000
output_tokens_per_call: 2000
calls_per_idea: 5
retry_rate: 0.1
ideas_per_day: 20
validation_rate: 0.25
cache_hit_rate: 0.5
model_id: gemini-2-5-flash

Result
model › id: gemini-2-5-flash
model › provider: google
model › name: Gemini 2.5 Flash
model › input usd per mtoken: 0.3
model › output usd per mtoken: 2.5
model › context window: 1000000
model › notes: Fast mid-tier; 1M context.
effective cost per call: 0.0086
cost per idea: 0.0473
cost per validated trade: 0.1892
cost per day: 0.9460000000000001
cost per month: 28.380000000000003
cost per year: 345.29

Same loop: Gemini 3.5 Flash (frontier, Flash latency)

Inputs
input_tokens_per_call: 12000
output_tokens_per_call: 2000
calls_per_idea: 5
retry_rate: 0.1
ideas_per_day: 20
validation_rate: 0.25
cache_hit_rate: 0.5
model_id: gemini-3-5-flash

Result
model › id: gemini-3-5-flash
model › provider: google
model › name: Gemini 3.5 Flash
model › input usd per mtoken: 1.5
model › output usd per mtoken: 9
model › context window: 1000000
model › notes: Frontier agent-tier at Flash speed, not a budget model (output ~3.6x Gemini 2.5 Flash).
effective cost per call: 0.036000000000000004
cost per idea: 0.198
cost per validated trade: 0.792
cost per day: 3.96
cost per month: 118.8
cost per year: 1445.4

Same loop: Gemini 2.5 Pro (frontier, 2M context)

Inputs
input_tokens_per_call: 12000
output_tokens_per_call: 2000
calls_per_idea: 5
retry_rate: 0.1
ideas_per_day: 20
validation_rate: 0.25
cache_hit_rate: 0.5
model_id: gemini-2-5-pro

Result
model › id: gemini-2-5-pro
model › provider: google
model › name: Gemini 2.5 Pro
model › input usd per mtoken: 1.25
model › output usd per mtoken: 10
model › context window: 2000000
model › notes: Large context (2M). Strong on document analysis.
effective cost per call: 0.035
cost per idea: 0.1925
cost per validated trade: 0.77
cost per day: 3.85
cost per month: 115.5
cost per year: 1405.25

Frequently asked questions

What is the cheapest Gemini model for finance in 2026?
Gemini 2.5 Flash-Lite ($0.10/$0.40 per Mtok), at $6.60 per month on the research loop here. It is the budget floor of the lineup, with a 1M context window. Gemini 2.5 Flash ($28.38 per month) is the next rung up.

Is Gemini 3.5 Flash the budget option?
No. On this loop it costs $118.80 per month, about 18x Gemini 2.5 Flash-Lite and 4x Gemini 2.5 Flash. It is a frontier agent-tier model at Flash latency, priced like Gemini 2.5 Pro, not like the economy tier.

Gemini 3.5 Flash or Gemini 2.5 Pro?
They cost within 3% on the same loop ($118.80 vs $115.50 per month). Choose Gemini 3.5 Flash for Flash latency, Gemini 2.5 Pro for the 2M context window. Price barely decides it.

Is Google's benchmark claim for 3.5 Flash verified?
No. It is Google's own launch claim on coding and agentic benchmarks, not an independent finance-task eval, and none was run here. Confirm capability on your own task.

Where do these Gemini costs come from?
Each model's verified 2026-05-25 list rate, run through the Token Cost Optimizer on one fixed research loop. Recomputed from the shipped bundle, not a benchmark run.

What's the cheapest Gemini for a finance research loop under $10/month?
Gemini 2.5 Flash-Lite, at $6.60 per month on this loop, is the only rung under $10. The next rung, Gemini 2.5 Flash, is $28.38 per month, already over. Frontier rungs (Gemini 3.5 Flash $118.80, Gemini 2.5 Pro $115.50) are an order of magnitude above a $10 budget.

Which Gemini tier should I use for high-volume SEC filing extraction?
Gemini 2.5 Flash-Lite. It is the budget floor ($6.60 per month on this loop) and its 1M context still fits a full 10-K, so structural extraction work does not need a higher rung. Promote a step to Gemini 2.5 Flash ($28.38 per month) or a frontier rung only when an eval shows Flash-Lite actually misses on that step.

Is Gemini 3.5 Flash worth it over Gemini 2.5 Flash for mid-weight summarization?
Usually not. On this loop 3.5 Flash is $118.80 per month against $28.38 for 2.5 Flash, a 4x jump for the same token shape, and summarization rarely needs frontier reasoning. Reserve 3.5 Flash for steps that genuinely need agent-tier judgment at Flash latency; keep summarization on 2.5 Flash.