The 2026 Gemini Cost Ladder for Finance: 3.5 Flash, Flash-Lite, 2.5 Pro

The short answer

After Google I/O 2026, the Gemini lineup for finance work is a clean three-rung cost ladder, and Gemini 3.5 Flash (launched May 19) sits at the top of it, not the bottom. On a general finance research loop (12k in / 2k out per call, 5 calls per idea, 10% retry, 20 ideas/day) the Token Cost Optimizer prices Gemini 3.5 Flash at $0.0360/call and $118.80/month: essentially tied with Gemini 2.5 Pro ($115.50/mo), about 4.2x Gemini 2.5 Flash ($28.38/mo), and about 18x Gemini 2.5 Flash-Lite ($6.60/mo). The "Flash" name suggests budget; the price says frontier. Every number below is recomputed live from the shipped engine bundle.

TL;DR: the 2026 Gemini cost ladder

Rung	Model	$/Mtok in	$/Mtok out	Cost / idea	Cost / month
Budget floor	Gemini 2.5 Flash-Lite	$0.10	$0.40	$0.0110	$6.60
Fast economy	Gemini 2.5 Flash	$0.30	$2.50	$0.0473	$28.38
Frontier (latency)	Gemini 3.5 Flash	$1.50	$9.00	$0.1980	$118.80
Frontier (context)	Gemini 2.5 Pro	$1.25	$10.00	$0.1925	$115.50

Same loop for every row: 12,000 input + 2,000 output tokens per call, 5 calls per idea, 10% retry, 20 ideas/day. Costs are the engine's own output on each model's verified list rate, not a benchmark run.
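The arithmetic behind each row can be sketched in a few lines of Python. This is a reconstruction from the published inputs and outputs, not the Token Cost Optimizer's actual source: the function name `loop_cost` is hypothetical, and the 30-day month is an inference from the engine output ($0.22/day against $6.60/month for Flash-Lite).

```python
# Hypothetical reconstruction of the loop arithmetic; not the engine's own code.
RATES = {  # USD per million tokens (input, output), from the table above
    "gemini-2-5-flash-lite": (0.10, 0.40),
    "gemini-2-5-flash": (0.30, 2.50),
    "gemini-3-5-flash": (1.50, 9.00),
    "gemini-2-5-pro": (1.25, 10.00),
}

def loop_cost(model_id, input_tokens=12_000, output_tokens=2_000,
              calls_per_idea=5, retry_rate=0.10, ideas_per_day=20,
              days_per_month=30):  # 30-day month is inferred, not documented
    rate_in, rate_out = RATES[model_id]
    # One call: tokens times the per-million-token rate, input plus output.
    per_call = (input_tokens * rate_in + output_tokens * rate_out) / 1_000_000
    # One idea: all calls, inflated by the retry rate.
    per_idea = per_call * calls_per_idea * (1 + retry_rate)
    per_month = per_idea * ideas_per_day * days_per_month
    return {"per_call": per_call, "per_idea": per_idea, "per_month": per_month}

for model_id in RATES:
    cost = loop_cost(model_id)
    print(f"{model_id:24s} ${cost['per_idea']:.4f}/idea  ${cost['per_month']:.2f}/month")
```

Changing any keyword argument (for example `ideas_per_day=50`) reprices the same loop at a different volume under the same assumptions.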

The three rungs

Rung 1, budget floor: Gemini 2.5 Flash-Lite ($0.10 / $0.40). The cheapest tier in the lineup, $6.60/mo on this loop. A 1M context window means it still fits a full 10-K. This is the model for high-volume extraction, classification, routing, and any step where the work is structural rather than reasoning-heavy.

Rung 2, fast economy: Gemini 2.5 Flash ($0.30 / $2.50). $28.38/mo, ~4.3x Flash-Lite. The default workhorse for mid-weight tasks: summarization, light synthesis, multi-document comparison that does not need frontier judgment.

Rung 3, frontier: Gemini 3.5 Flash ($1.50 / $9.00) and Gemini 2.5 Pro ($1.25 / $10.00). $118.80/mo and $115.50/mo respectively, within 3% of each other and ~18x the budget floor. Gemini 3.5 Flash buys agent-tier reasoning at Flash latency; Gemini 2.5 Pro buys the 2M context window listed in the engine's model table, double the 1M of the other rungs. The price barely separates them; the choice is latency-vs-context.

Where Gemini 3.5 Flash actually fits

The mistake the name invites is treating Gemini 3.5 Flash as a drop-in upgrade for Gemini 2.5 Flash. It is not a fast-economy model: its output rate ($9.00) is ~3.6x Gemini 2.5 Flash's ($2.50) and the loop cost reflects that ($118.80 vs $28.38/mo, a 4.2x jump). Gemini 3.5 Flash earns its place when the task genuinely needs frontier reasoning and you cannot accept Pro-tier latency. For anything below that bar, you are paying a frontier premium for economy-tier work.
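The gap between the 3.6x output-rate ratio and the 4.2x loop ratio comes from the input side, where the rate ratio is larger. A minimal sketch of that decomposition, using only the list rates and token counts quoted in this article (the helper name `call_cost` is hypothetical):

```python
# USD per million tokens (input, output), from the cost-ladder table.
FLASH_25 = (0.30, 2.50)
FLASH_35 = (1.50, 9.00)

INPUT_TOKENS, OUTPUT_TOKENS = 12_000, 2_000  # per call, same loop as the table

def call_cost(rates):
    """Return (input_usd, output_usd) for a single call."""
    rate_in, rate_out = rates
    return (INPUT_TOKENS * rate_in / 1e6, OUTPUT_TOKENS * rate_out / 1e6)

in_25, out_25 = call_cost(FLASH_25)
in_35, out_35 = call_cost(FLASH_35)

input_ratio = in_35 / in_25                          # same as 1.50 / 0.30
output_ratio = out_35 / out_25                       # same as 9.00 / 2.50
blended_ratio = (in_35 + out_35) / (in_25 + out_25)  # what the loop actually pays

print(f"input {input_ratio:.1f}x, output {output_ratio:.1f}x, blended {blended_ratio:.1f}x")
```

Retries, calls per idea, and ideas per day multiply both models equally, so the blended per-call ratio is also the monthly ratio.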

The cleanest pattern is to tier the loop: run the bulk of the calls on Flash-Lite or Flash, and route only the steps that need agent-tier judgment to Gemini 3.5 Flash. On the loop above, a mostly-economy loop with a thin frontier layer lands far closer to the $6-28/mo rungs than to the $119/mo rung.
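To put a number on the tiered pattern, the sketch below prices a split loop. The 4-plus-1 split is a hypothetical example, not a recommendation from the engine, and it assumes every call keeps the same 12k-in / 2k-out token shape regardless of which model serves it.

```python
# Per-call cost on the 12k-in / 2k-out loop, from the verified engine output.
COST_PER_CALL = {
    "gemini-2-5-flash-lite": 0.002,
    "gemini-2-5-flash": 0.0086,
    "gemini-3-5-flash": 0.036,
}

def tiered_monthly_cost(calls_by_model, retry_rate=0.10, ideas_per_day=20, days=30):
    """Monthly cost when one idea's calls are split across models."""
    per_idea = sum(COST_PER_CALL[model] * n for model, n in calls_by_model.items())
    return per_idea * (1 + retry_rate) * ideas_per_day * days

# Hypothetical split: 4 economy calls plus 1 frontier call per idea.
lite_plus_frontier = tiered_monthly_cost({"gemini-2-5-flash-lite": 4, "gemini-3-5-flash": 1})
flash_plus_frontier = tiered_monthly_cost({"gemini-2-5-flash": 4, "gemini-3-5-flash": 1})
all_frontier = tiered_monthly_cost({"gemini-3-5-flash": 5})

print(f"Flash-Lite x4 + 3.5 Flash x1: ${lite_plus_frontier:.2f}/month")
print(f"2.5 Flash x4 + 3.5 Flash x1:  ${flash_plus_frontier:.2f}/month")
print(f"3.5 Flash x5:                 ${all_frontier:.2f}/month")
```

Under these assumptions the single frontier call dominates the blended bill, which is why keeping the frontier layer thin matters more than the choice between the two economy rungs.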

Google's launch claim, stated as Google's claim

At I/O, Google said Gemini 3.5 Flash beats Gemini 3.1 Pro on coding and agentic benchmarks. That is a vendor benchmark. There was no independent third-party finance-task eval at launch, and none was run here. This article ranks the models on cost (computed from verified list prices) and says nothing about which reads a filing or reasons over a thesis most accurately. Confirm capability on your own task; the ladder here only tells you what each rung costs.

Decision guidance

Start at the bottom. Default a finance step to Flash-Lite; promote it up a rung only when an eval shows the cheaper rung misses.
Treat Gemini 3.5 Flash as a frontier choice, not an economy upgrade. Budget ~$119/mo for the loop above; scale linearly with calls and output tokens.
Pick the top rung on latency vs context. Gemini 3.5 Flash for Flash latency, Gemini 2.5 Pro for the 2M window. Cost is a near-tie.
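Output tokens carry an outsized share of the bill at the top rung. They are priced 6x to 8x input, so on this loop 2,000 output tokens cost as much as 12,000 input tokens on Gemini 3.5 Flash and more than that on Gemini 2.5 Pro. Terse outputs are the cheapest cut; tighter prompts trim the rest.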
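The sensitivity of the top-rung bill to token counts can be checked with a short sketch. It reuses the loop parameters from this article; the function name `monthly_cost` and the "halve the tokens" scenarios are illustrative assumptions, and the 30-day month is inferred from the engine output.

```python
# USD per million tokens (input, output) for the two frontier models.
RATES = {"gemini-3-5-flash": (1.50, 9.00), "gemini-2-5-pro": (1.25, 10.00)}

def monthly_cost(model_id, input_tokens, output_tokens,
                 calls_per_idea=5, retry_rate=0.10, ideas_per_day=20, days=30):
    rate_in, rate_out = RATES[model_id]
    per_call = (input_tokens * rate_in + output_tokens * rate_out) / 1e6
    return per_call * calls_per_idea * (1 + retry_rate) * ideas_per_day * days

results = {}
for model_id in RATES:
    results[model_id] = {
        "baseline": monthly_cost(model_id, 12_000, 2_000),
        "half_output": monthly_cost(model_id, 12_000, 1_000),  # trims 1,000 tokens
        "half_input": monthly_cost(model_id, 6_000, 2_000),    # trims 6,000 tokens
    }
    row = results[model_id]
    print(f"{model_id}: base ${row['baseline']:.2f}, "
          f"half output ${row['half_output']:.2f}, half input ${row['half_input']:.2f}")
```

On this loop shape, removing 1,000 output tokens per call saves as much as removing 6,000 input tokens on Gemini 3.5 Flash, and more on Gemini 2.5 Pro.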

Connects to

Token Cost Optimizer: the per-call cost engine behind every rung here.
Cheapest LLM for SEC Filings 2026: where the budget rung does its best work.
Claude vs GPT-5 vs Gemini for Financial Analysis 2026: the cross-vendor tier comparison.
Best LLM for Financial Analysis 2026: the task-tiered pillar that places each model.

References

Google. "Gemini Developer API pricing." ai.google.dev, verified 2026-05-25. https://ai.google.dev/gemini-api/docs/pricing
Anthropic. "Pricing." platform.claude.com, verified 2026-05-25. https://platform.claude.com/docs/en/about-claude/pricing

Verified engine output

Finance research loop — Gemini 2.5 Flash-Lite (budget floor)

Inputs
input_tokens_per_call	12000
output_tokens_per_call	2000
calls_per_idea	5
retry_rate	0.1
ideas_per_day	20
model_id	gemini-2-5-flash-lite

Result
model › id	gemini-2-5-flash-lite
model › provider	google
model › name	Gemini 2.5 Flash-Lite
model › input usd per mtoken	0.1
model › output usd per mtoken	0.4
model › context window	1000000
model › notes	Cheapest tier in this table; 1M context.
effective cost per call	0.002
cost per idea	0.011
cost per validated trade	0.055
cost per day	0.22
cost per month	6.6
cost per year	80.3

Computed live at build time.

Same loop — Gemini 2.5 Flash (fast economy)

Inputs
input_tokens_per_call	12000
output_tokens_per_call	2000
calls_per_idea	5
retry_rate	0.1
ideas_per_day	20
model_id	gemini-2-5-flash

Result
model › id	gemini-2-5-flash
model › provider	google
model › name	Gemini 2.5 Flash
model › input usd per mtoken	0.3
model › output usd per mtoken	2.5
model › context window	1000000
model › notes	Fast mid-tier; 1M context.
effective cost per call	0.0086
cost per idea	0.0473
cost per validated trade	0.2365
cost per day	0.946
cost per month	28.38
cost per year	345.29

Same loop — Gemini 3.5 Flash (frontier, Flash latency)

Inputs
input_tokens_per_call	12000
output_tokens_per_call	2000
calls_per_idea	5
retry_rate	0.1
ideas_per_day	20
model_id	gemini-3-5-flash

Result
model › id	gemini-3-5-flash
model › provider	google
model › name	Gemini 3.5 Flash
model › input usd per mtoken	1.5
model › output usd per mtoken	9
model › context window	1000000
model › notes	Frontier agent-tier at Flash speed — not a budget model (output ~3.6x Gemini 2.5 Flash).
effective cost per call	0.036
cost per idea	0.198
cost per validated trade	0.99
cost per day	3.96
cost per month	118.8
cost per year	1445.4

Same loop — Gemini 2.5 Pro (frontier, 2M context)

Inputs
input_tokens_per_call	12000
output_tokens_per_call	2000
calls_per_idea	5
retry_rate	0.1
ideas_per_day	20
model_id	gemini-2-5-pro

Result
model › id	gemini-2-5-pro
model › provider	google
model › name	Gemini 2.5 Pro
model › input usd per mtoken	1.25
model › output usd per mtoken	10
model › context window	2000000
model › notes	Large context (2M). Strong on document analysis.
effective cost per call	0.035
cost per idea	0.1925
cost per validated trade	0.9625
cost per day	3.85
cost per month	115.5
cost per year	1405.25

Frequently asked questions

What is the cheapest Gemini model for finance in 2026?

Gemini 2.5 Flash-Lite ($0.10/$0.40 per Mtok), at $6.60 per month on the research loop here. It is the budget floor of the lineup, with a 1M context window. Gemini 2.5 Flash ($28.38 per month) is the next rung up.

Is Gemini 3.5 Flash the budget option?

No. On this loop it costs $118.80 per month, about 18x Gemini 2.5 Flash-Lite and 4x Gemini 2.5 Flash. It is a frontier agent-tier model at Flash latency, priced like Gemini 2.5 Pro, not like the economy tier.

Gemini 3.5 Flash or Gemini 2.5 Pro?

They cost within 3% on the same loop ($118.80 vs $115.50 per month). Choose Gemini 3.5 Flash for Flash latency, Gemini 2.5 Pro for the 2M context window. Price barely decides it.

Is Google's benchmark claim for 3.5 Flash verified?

No. It is Google's own launch claim on coding and agentic benchmarks, not an independent finance-task eval, and none was run here. Confirm capability on your own task.

Where do these Gemini costs come from?

Each model's verified 2026-05-25 list rate, run through the Token Cost Optimizer on one fixed research loop. Recomputed from the shipped bundle, not a benchmark run.

What's the cheapest Gemini for a finance research loop under $10/month?

Gemini 2.5 Flash-Lite, at $6.60 per month on this loop, is the only rung under $10. The next rung, Gemini 2.5 Flash, is $28.38 per month, already over. Frontier rungs (Gemini 3.5 Flash $118.80, Gemini 2.5 Pro $115.50) are an order of magnitude above a $10 budget.

Which Gemini tier should I use for high-volume SEC filing extraction?

Gemini 2.5 Flash-Lite. It is the budget floor ($6.60 per month on this loop) and its 1M context still fits a full 10-K, so structural extraction work does not need a higher rung. Promote a step to Gemini 2.5 Flash ($28.38 per month) or a frontier rung only when an eval shows Flash-Lite actually misses on that step.

Is Gemini 3.5 Flash worth it over Gemini 2.5 Flash for mid-weight summarization?

Usually not. On this loop 3.5 Flash is $118.80 per month against $28.38 for 2.5 Flash, a 4x jump for the same token shape, and summarization rarely needs frontier reasoning. Reserve 3.5 Flash for steps that genuinely need agent-tier judgment at Flash latency; keep summarization on 2.5 Flash.
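For the budget question above, the same per-idea costs can be turned around to ask how many ideas per day each rung supports under a monthly cap. This is a minimal sketch: the helper `max_ideas_per_day` is hypothetical, the per-idea costs are copied from the cost-ladder table (so they already include 5 calls and the 10% retry rate), and the 30-day month is inferred from the engine output.

```python
import math

COST_PER_IDEA = {  # USD per idea, from the cost-ladder table
    "Gemini 2.5 Flash-Lite": 0.0110,
    "Gemini 2.5 Flash": 0.0473,
    "Gemini 3.5 Flash": 0.1980,
    "Gemini 2.5 Pro": 0.1925,
}

def max_ideas_per_day(model, monthly_budget_usd, days=30):
    """Largest whole number of ideas per day that stays within the budget."""
    return math.floor(monthly_budget_usd / (COST_PER_IDEA[model] * days))

for model in COST_PER_IDEA:
    print(f"{model:24s} {max_ideas_per_day(model, 10.0)} ideas/day on $10/month")
```

Only Flash-Lite clears the loop's 20 ideas/day at a $10 cap; every other rung has to cut volume well below that.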