Is Gemini 3.5 Flash a budget model?

No. At $1.50/$9.00 per Mtok its output rate is about 3.6x Gemini 2.5 Flash and roughly on par with Gemini 2.5 Pro. The genuine budget tier is Gemini 2.5 Flash-Lite ($0.10/$0.40). On a fixed finance agent loop, Gemini 3.5 Flash costs about 18x Flash-Lite.

What does a finance agent loop cost on Gemini 3.5 Flash?

On an 8-step loop (9k in / 1.5k out per step, 60% convergence check, 40 markets per day, 30-day calendar) the engine prices it at $0.2322 per loop and $278.64 per month. Cost scales linearly with steps, markets, and output tokens.

Is Google's claim that 3.5 Flash beats 3.1 Pro verified?

That is Google's own benchmark claim from the I/O launch. It has not been independently verified here, and no third-party finance-task eval was available at launch. Confirm it on your own task before relying on it.

Gemini 3.5 Flash or Gemini 2.5 Pro for a finance agent?

They cost within 3% on the same loop ($278.64 vs $270.90 per month). Pick Gemini 3.5 Flash for Flash latency, Gemini 2.5 Pro for the 2M context window. The price barely decides it.

Where do these numbers come from?

Each model's verified list rate (Google rates verified 2026-05-25, Anthropic rates verified 2026-06-18), run through the Agent Cost Envelope Calculator on one fixed loop shape. The numbers are recomputed from the shipped bundle, not from a benchmark run.

What's the cheapest model for a finance agent loop under $50/month?

On this 8-step loop only Gemini 2.5 Flash-Lite ($15.48 per month) clears under $50. Gemini 2.5 Flash is next at $66.56 per month, already over. Everything above that — Gemini 2.5 Pro ($270.90), Gemini 3.5 Flash ($278.64), Sonnet 4.6 ($510.84), Opus 4.8 ($851.40) — is multiples above a $50 budget on the same loop shape.

Which Gemini tier should I use for agent loops on a tight budget?

Gemini 2.5 Flash-Lite for extraction, classification, and routing steps ($15.48 per month on this loop). Reserve Gemini 3.5 Flash ($278.64 per month) for the few steps that genuinely need agent-tier judgment. A tiered loop — most steps on Flash-Lite, a couple on 3.5 Flash — lands far below the $278.64 all-frontier figure.

Is Gemini 3.5 Flash worth it over Flash-Lite for a finance agent that mostly extracts and routes?

No. On the same loop, 3.5 Flash costs about 18x Flash-Lite ($278.64 vs $15.48 per month) and Flash-Lite still fits a full filing in its 1M context. Pay for 3.5 Flash only on steps that need agent-tier reasoning at Flash latency, not on extraction or routing.

Gemini 3.5 Flash for Financial Agents: The Cost Reality 2026

The short answer

Gemini 3.5 Flash (launched May 19, 2026) is a frontier agent-tier model at Flash latency, priced $1.50/$9.00 per Mtok. On a realistic finance research-agent loop the Agent Cost Envelope Calculator prices the month at $278.64, roughly the same envelope as Gemini 2.5 Pro and about 18x the genuine cheap tier, Gemini 2.5 Flash-Lite. Fast, not cheap.

Gemini 3.5 Flash (launched May 19, 2026 at Google I/O) is a frontier agent-tier model running at Flash latency, priced at $1.50 / $9.00 per Mtok. That is the draw and the trap: the intelligence-per-second is the reason to put it in a finance agent loop, but at $9.00 output it is not a budget model. On a realistic finance research-agent loop (8 steps, 9k in / 1.5k out per step, 40 markets/day, crypto calendar) the Agent Cost Envelope Calculator prices the loop at $0.2322 and the month at $278.64 — roughly the same envelope as Gemini 2.5 Pro ($270.90/mo) and about 18x the genuine cheap tier, Gemini 2.5 Flash-Lite ($15.48/mo). All figures below are computed live from the shipped engine bundle, not typed by hand.

TL;DR

Model	$/Mtok in	$/Mtok out	Cost / loop	Cost / month	Tier
Gemini 2.5 Flash-Lite	$0.10	$0.40	$0.0129	$15.48	economy
Gemini 2.5 Flash	$0.30	$2.50	$0.0555	$66.56	economy
Gemini 2.5 Pro	$1.25	$10.00	$0.2258	$270.90	frontier
Gemini 3.5 Flash	$1.50	$9.00	$0.2322	$278.64	frontier
Claude Sonnet 4.6	$3.00	$15.00	$0.4257	$510.84	mid
Claude Opus 4.8	$5.00	$25.00	$0.7095	$851.40	frontier

Same agent loop for every row: 8 steps/loop, 9,000 input + 1,500 output tokens per step, a convergence check that the engine prices at 60% of one step, 40 markets/day, 30-day crypto calendar. Monthly cost is the engine's own output for that loop shape at each model's verified list rate, not a benchmark run.
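The table can be reproduced from the list rates alone. The sketch below assumes the pricing formula implied by the engine output further down (the convergence check billed as 60% of one ordinary step, one loop per market per day). The function names are illustrative and are not the calculator's actual API.

```python
# Illustrative recomputation of the TL;DR table. The formula is inferred
# from the published engine output, not taken from the calculator's source.

# (input $/Mtok, output $/Mtok) list rates from the table above
RATES = {
    "Gemini 2.5 Flash-Lite": (0.10, 0.40),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "Gemini 3.5 Flash": (1.50, 9.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Opus 4.8": (5.00, 25.00),
}


def loop_cost(rate_in, rate_out, in_tokens=9_000, out_tokens=1_500,
              steps=8, convergence_pct=60):
    """Cost in USD of one agent loop.

    Assumption: the convergence check costs convergence_pct percent of
    one ordinary step, which matches the engine's "convergence cost" line.
    """
    step_cost = (in_tokens * rate_in + out_tokens * rate_out) / 1_000_000
    return step_cost * (steps + convergence_pct / 100)


def monthly_cost(rate_in, rate_out, markets_per_day=40, days=30):
    """Assumes one loop per market per day over a 30-day calendar."""
    return loop_cost(rate_in, rate_out) * markets_per_day * days


if __name__ == "__main__":
    for name, (rate_in, rate_out) in RATES.items():
        print(f"{name:<24} ${loop_cost(rate_in, rate_out):.4f}/loop  "
              f"${monthly_cost(rate_in, rate_out):,.2f}/month")
```

Changing `steps`, `out_tokens`, or `markets_per_day` in this sketch is a quick way to see how far a different loop shape moves each row before running the real calculator.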

Why "Flash" is doing a lot of work in the name

The Flash brand has meant "cheap and fast" since the 2.5 line. Gemini 2.5 Flash is $0.30 / $2.50 and Gemini 2.5 Flash-Lite is $0.10 / $0.40 — both economy-tier extraction workhorses. Gemini 3.5 Flash keeps the Flash latency profile but moves the price into the frontier band: $1.50 input is 5x Gemini 2.5 Flash, and $9.00 output is ~3.6x its $2.50 output.¹

Put a number on it. On the loop above, swapping Gemini 2.5 Flash for Gemini 3.5 Flash takes the monthly bill from $66.56 to $278.64 — a 4.2x jump for the same token shape. The latency stays in Flash territory; the cost does not.
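The multiples quoted here follow directly from figures already in this article. A few lines of arithmetic, with illustrative variable names, make the relationship explicit:

```python
# Rate and bill multiples for Gemini 3.5 Flash relative to Gemini 2.5 Flash,
# using only figures quoted in this article.
flash_25 = {"in": 0.30, "out": 2.50, "month": 66.56}
flash_35 = {"in": 1.50, "out": 9.00, "month": 278.64}

multiples = {key: flash_35[key] / flash_25[key] for key in flash_25}

for key, value in multiples.items():
    print(f"{key:<5} {value:.1f}x")
# in    5.0x
# out   3.6x
# month 4.2x
```

The monthly multiple sits between the output multiple and the input multiple because the loop pays for both sides of the rate card.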

Google's frontier claim is a vendor claim

Google positioned Gemini 3.5 Flash as beating Gemini 3.1 Pro on coding and agentic benchmarks at launch. Treat that as a vendor benchmark, not an independently verified result. We have not benchmarked it, and no third-party finance-task eval was available at launch. The defensible reading: it is plausibly the strongest model in Google's lineup for agentic tool-use at this latency, and you should confirm that on your own task before you let the loop cost ride on it.

The honest cost ladder for a finance agent loop

The engine output below ranks all six models on one fixed loop. Read it as a ladder:

Genuine cheap tier: Gemini 2.5 Flash-Lite at $15.48/mo and Gemini 2.5 Flash at $66.56/mo. If your agent's per-step work is extraction, classification, or routing, this is where it belongs.
Frontier tier (Gemini): Gemini 2.5 Pro at $270.90/mo and Gemini 3.5 Flash at $278.64/mo land within 3% of each other. Gemini 3.5 Flash buys you Flash latency at roughly Pro-tier cost; Gemini 2.5 Pro buys you a 2M context window at roughly the same cost.
Claude tier: Sonnet 4.6 at $510.84/mo and Opus 4.8 at $851.40/mo². Their output rates ($15.00 and $25.00 per Mtok) are the highest in the table, so an output-heavy agent loop is most expensive here.

The single most important comparison: Gemini 3.5 Flash costs about 18x what Gemini 2.5 Flash-Lite costs on the same loop ($278.64 vs $15.48). The "cheap Gemini" story is Flash-Lite. Gemini 3.5 Flash is the "agent-tier intelligence at Flash speed" story, and you pay frontier prices for it.

Output tokens dominate an agent loop — budget them

Agent loops are output-heavy relative to one-shot extraction: every step emits tool-call arguments plus reasoning, and a convergence step re-reads and re-summarizes. Because Gemini 3.5 Flash's output rate ($9.00) is 6x its input rate ($1.50), output is the expensive side of the bill: on this loop shape the 1,500 output tokens per step are one seventh of the tokens but half of the cost ($0.0135 of each $0.027 step). Two levers actually move the bill:

Cap output per step. Tighter tool-call schemas and terse intermediate reasoning cut the dominant cost term directly.
Tier the loop. Run extraction and routing steps on Flash-Lite, reserve Gemini 3.5 Flash for the steps that genuinely need agent-tier judgment. A loop that runs 6 cheap steps and 2 frontier steps lands far below an all-frontier loop.
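Both levers can be priced with a few lines of arithmetic. The sketch below uses the article's loop shape and list rates and rests on two assumptions: the convergence check costs 60% of one ordinary step (as the engine output implies), and in the tiered loop it runs on the frontier model. The helper names are hypothetical.

```python
# Sketch of the two cost levers on the article's loop shape.

FLASH_LITE = (0.10, 0.40)   # $/Mtok (input, output)
FLASH_35 = (1.50, 9.00)
LOOPS_PER_MONTH = 40 * 30   # 40 markets/day over a 30-day calendar


def step_cost(rates, in_tokens=9_000, out_tokens=1_500):
    """USD cost of one step at the given (input, output) rates."""
    rate_in, rate_out = rates
    return (in_tokens * rate_in + out_tokens * rate_out) / 1_000_000


def output_share(rates, in_tokens=9_000, out_tokens=1_500):
    """Fraction of a step's cost that comes from output tokens."""
    output_cost = out_tokens * rates[1] / 1_000_000
    return output_cost / step_cost(rates, in_tokens, out_tokens)


def capped_monthly(out_tokens, steps=8, convergence_pct=60):
    """Lever 1: all-frontier loop with a different output cap per step."""
    weight = steps + convergence_pct / 100
    return weight * step_cost(FLASH_35, out_tokens=out_tokens) * LOOPS_PER_MONTH


def tiered_monthly(cheap_steps=6, frontier_steps=2, convergence_pct=60):
    """Lever 2: cheap steps on Flash-Lite, the rest on Gemini 3.5 Flash.

    Assumption: the convergence check is billed at the frontier rate.
    """
    loop = (cheap_steps * step_cost(FLASH_LITE)
            + (frontier_steps + convergence_pct / 100) * step_cost(FLASH_35))
    return loop * LOOPS_PER_MONTH


if __name__ == "__main__":
    print(f"output share of step cost: {output_share(FLASH_35):.0%}")
    print(f"cap output at 1,000/step:  ${capped_monthly(1_000):.2f}/month")
    print(f"6 cheap + 2 frontier:      ${tiered_monthly():.2f}/month")
```

Under these assumptions, capping output at 1,000 tokens per step brings the all-frontier loop to about $232 per month, and the 6 + 2 split comes to about $95 per month, against $278.64 for the untouched loop.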

The Token Cost Optimizer prices the per-call side of this with caching, and the Model Selector for Finance maps task tiers to models so you don't put a frontier model on an extraction step by accident.

Decision guidance

You need agent-tier reasoning at sub-second latency, and the budget is real: Gemini 3.5 Flash. Budget ~$279/mo for the loop above and scale linearly with markets/day and steps/loop.
You want the cheapest Gemini that still fits a full filing in context: Gemini 2.5 Flash-Lite (1M context, ~$15/mo on this loop). This is the budget option, not Gemini 3.5 Flash.
You need a 2M context window at frontier quality: Gemini 2.5 Pro, at near-identical cost to Gemini 3.5 Flash but slower.
Your loop is output-heavy: expect the Claude tiers to cost the most, since their output rates are the highest in the table; price the exact shape before committing.
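Because the bill scales with markets/day, a monthly budget translates directly into a coverage ceiling. A minimal sketch, assuming one loop per market per day and the per-loop costs from the TL;DR table; `max_markets_per_day` is a hypothetical helper, not part of the calculator.

```python
import math

# Per-loop costs (USD) copied from the TL;DR table.
COST_PER_LOOP = {
    "Gemini 2.5 Flash-Lite": 0.0129,
    "Gemini 2.5 Flash": 0.0555,
    "Gemini 2.5 Pro": 0.2258,
    "Gemini 3.5 Flash": 0.2322,
    "Claude Sonnet 4.6": 0.4257,
    "Claude Opus 4.8": 0.7095,
}


def max_markets_per_day(monthly_budget_usd, cost_per_loop_usd, days=30):
    """Largest whole number of markets/day that stays within the budget.

    Assumes one loop per market per day.
    """
    return math.floor(monthly_budget_usd / (cost_per_loop_usd * days))


if __name__ == "__main__":
    for name, cost in COST_PER_LOOP.items():
        ceiling = max_markets_per_day(50, cost)
        print(f"{name:<24} {ceiling:>4} markets/day on $50/month")
```

On a $50 monthly budget this gives 129 markets/day on Flash-Lite and 7 on Gemini 3.5 Flash, which is the roughly 18x cost gap restated as coverage.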

Connects to

Agent Cost Envelope Calculator: the loop-cost engine behind every number here.
Token cost reality for LLM trading research: the per-call cost discipline this loop sits on top of.
Claude vs GPT-5 vs Gemini for Financial Analysis 2026: the tier-vs-vendor framing across all three families.
Best LLM for Financial Analysis 2026: the task-tiered pillar that places each model.

References

1. Google. "Gemini Developer API pricing." ai.google.dev, verified 2026-05-25. https://ai.google.dev/gemini-api/docs/pricing
2. Anthropic. "Pricing." platform.claude.com, verified 2026-06-18. https://platform.claude.com/docs/en/about-claude/pricing

Verified engine output


Finance agent loop — Gemini 3.5 Flash (8 steps, 40 markets/day, crypto calendar)

Inputs
input_tokens_per_step	9000
output_tokens_per_step	1500
steps_per_loop	8
convergence_check_pct	60
markets_per_day	40
target_monthly_usd	1500
calendar_mode	crypto
model_id	gemini-3-5-flash

Result
model › id	gemini-3-5-flash
model › provider	google
model › name	Gemini 3.5 Flash
model › tier	frontier
model › input usd per mtoken	1.5
model › output usd per mtoken	9
model › context window	1000000
model › notes	Frontier agent-tier at Flash speed — output ~3.6x Gemini 2.5 Flash, not a budget pick.
steps (9 items)	[...]
cost per loop	0.23219999999999996
tool use subtotal	0.21599999999999997
convergence cost	0.016199999999999996
cost per day	9.287999999999998
cost per month	278.63999999999993
days per month	30
tokens per loop	90300
blended usd per 1k tokens	0.0025714285714285713
within budget	true
budget utilization	0.18575999999999995

Computed live at build time.
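The breakdown fields in this record can be cross-checked by hand. The sketch below is an assumption about how the fields relate, inferred from the numbers themselves rather than from the engine's source, and the `envelope` function and its keys are illustrative.

```python
def envelope(rate_in, rate_out, in_tokens=9_000, out_tokens=1_500,
             steps=8, convergence_pct=60, markets_per_day=40,
             days=30, target_monthly_usd=1_500):
    """Recompute the engine's breakdown fields from the listed inputs.

    Assumption: the convergence check counts as convergence_pct percent
    of one ordinary step, in both cost and tokens.
    """
    fraction = convergence_pct / 100
    per_step = (in_tokens * rate_in + out_tokens * rate_out) / 1_000_000
    tool_use = steps * per_step
    convergence = fraction * per_step
    per_loop = tool_use + convergence
    tokens = round((steps + fraction) * (in_tokens + out_tokens))
    per_month = per_loop * markets_per_day * days
    return {
        "tool_use_subtotal": tool_use,
        "convergence_cost": convergence,
        "cost_per_loop": per_loop,
        "cost_per_day": per_loop * markets_per_day,
        "cost_per_month": per_month,
        "tokens_per_loop": tokens,
        "blended_usd_per_1k_tokens": per_loop / tokens * 1_000,
        "budget_utilization": per_month / target_monthly_usd,
    }


if __name__ == "__main__":
    # Gemini 3.5 Flash list rates: $1.50 in, $9.00 out per Mtok
    for field, value in envelope(1.50, 9.00).items():
        print(f"{field:<28} {value}")
```

The same call with another model's rates should line up with that model's record in the same way, up to floating-point noise in the last digits.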

Same loop — Gemini 2.5 Flash-Lite (genuine budget tier)

Inputs
input_tokens_per_step	9000
output_tokens_per_step	1500
steps_per_loop	8
convergence_check_pct	60
markets_per_day	40
target_monthly_usd	1500
calendar_mode	crypto
model_id	gemini-2-5-flash-lite

Result
model › id	gemini-2-5-flash-lite
model › provider	google
model › name	Gemini 2.5 Flash-Lite
model › tier	economy
model › input usd per mtoken	0.1
model › output usd per mtoken	0.4
model › context window	1000000
model › notes	Cheapest tier in this table.
steps (9 items)	[...]
cost per loop	0.0129
tool use subtotal	0.012
convergence cost	0.0009
cost per day	0.516
cost per month	15.48
days per month	30
tokens per loop	90300
blended usd per 1k tokens	0.00014285714285714284
within budget	true
budget utilization	0.010320000000000001

Computed live at build time.

Same loop — Gemini 2.5 Pro

Inputs
input_tokens_per_step	9000
output_tokens_per_step	1500
steps_per_loop	8
convergence_check_pct	60
markets_per_day	40
target_monthly_usd	1500
calendar_mode	crypto
model_id	gemini-2-5-pro

Result
model › id	gemini-2-5-pro
model › provider	google
model › name	Gemini 2.5 Pro
model › tier	frontier
model › input usd per mtoken	1.25
model › output usd per mtoken	10
model › context window	2000000
model › notes	Largest context (2M).
steps (9 items)	[...]
cost per loop	0.22575
tool use subtotal	0.21
convergence cost	0.01575
cost per day	9.030000000000001
cost per month	270.90000000000003
days per month	30
tokens per loop	90300
blended usd per 1k tokens	0.0025
within budget	true
budget utilization	0.1806

Computed live at build time.

Same loop — Gemini 2.5 Flash

Inputs
input_tokens_per_step	9000
output_tokens_per_step	1500
steps_per_loop	8
convergence_check_pct	60
markets_per_day	40
target_monthly_usd	1500
calendar_mode	crypto
model_id	gemini-2-5-flash

Result
model › id	gemini-2-5-flash
model › provider	google
model › name	Gemini 2.5 Flash
model › tier	economy
model › input usd per mtoken	0.3
model › output usd per mtoken	2.5
model › context window	1000000
model › notes	Fast mid-tier; 1M context.
steps (9 items)	[...]
cost per loop	0.05546999999999999
tool use subtotal	0.05159999999999999
convergence cost	0.0038699999999999993
cost per day	2.2188
cost per month	66.564
days per month	30
tokens per loop	90300
blended usd per 1k tokens	0.0006142857142857142
within budget	true
budget utilization	0.04437599999999999

Computed live at build time.

Same loop — Claude Sonnet 4.6

Inputs
input_tokens_per_step	9000
output_tokens_per_step	1500
steps_per_loop	8
convergence_check_pct	60
markets_per_day	40
target_monthly_usd	1500
calendar_mode	crypto
model_id	claude-sonnet-4-6

Result
model › id	claude-sonnet-4-6
model › provider	anthropic
model › name	Claude Sonnet 4.6
model › tier	mid
model › input usd per mtoken	3
model › output usd per mtoken	15
model › cache read usd per mtoken	0.3
model › context window	500000
model › notes	Default pick for bulk research loops.
steps (9 items)	[...]
cost per loop	0.42569999999999997
tool use subtotal	0.39599999999999996
convergence cost	0.029699999999999997
cost per day	17.028
cost per month	510.84
days per month	30
tokens per loop	90300
blended usd per 1k tokens	0.004714285714285714
within budget	true
budget utilization	0.34056

Computed live at build time.

Same loop — Claude Opus 4.8

Inputs
input_tokens_per_step	9000
output_tokens_per_step	1500
steps_per_loop	8
convergence_check_pct	60
markets_per_day	40
target_monthly_usd	1500
calendar_mode	crypto
model_id	claude-opus-4-8

Result
model › id	claude-opus-4-8
model › provider	anthropic
model › name	Claude Opus 4.8
model › tier	frontier
model › input usd per mtoken	5
model › output usd per mtoken	25
model › cache read usd per mtoken	0.5
model › context window	1000000
model › notes	Frontier reasoning — 1M context.
steps (9 items)	[...]
cost per loop	0.7094999999999999
tool use subtotal	0.6599999999999999
convergence cost	0.049499999999999995
cost per day	28.379999999999995
cost per month	851.3999999999999
days per month	30
tokens per loop	90300
blended usd per 1k tokens	0.007857142857142856
within budget	true
budget utilization	0.5675999999999999

Computed live at build time.
