The short answer

Gemini 3.5 Flash (launched May 19, 2026) is a frontier agent-tier model at Flash latency, priced $1.50/$9.00 per Mtok. On a realistic finance research-agent loop the Agent Cost Envelope Calculator prices the month at $278.64, roughly the same envelope as Gemini 2.5 Pro and about 18x the genuine cheap tier, Gemini 2.5 Flash-Lite. Fast, not cheap.

Gemini 3.5 Flash (launched May 19, 2026 at Google I/O) is a frontier agent-tier model running at Flash latency, priced at $1.50 / $9.00 per Mtok. That is the draw and the trap: the intelligence-per-second is the reason to put it in a finance agent loop, but at $9.00 output it is not a budget model. On a realistic finance research-agent loop (8 steps, 9k in / 1.5k out per step, 40 markets/day, crypto calendar) the Agent Cost Envelope Calculator prices the loop at $0.2322 and the month at $278.64 — roughly the same envelope as Gemini 2.5 Pro ($270.90/mo) and about 18x the genuine cheap tier, Gemini 2.5 Flash-Lite ($15.48/mo). All figures below are computed live from the shipped engine bundle, not typed by hand.

TL;DR

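| Model                 | $/Mtok in | $/Mtok out | Cost / loop | Cost / month | Tier     |
|-----------------------|-----------|------------|-------------|--------------|----------|
| Gemini 2.5 Flash-Lite | $0.10     | $0.40      | $0.0129     | $15.48       | economy  |
| Gemini 2.5 Flash      | $0.30     | $2.50      | $0.0555     | $66.56       | economy  |
| Gemini 2.5 Pro        | $1.25     | $10.00     | $0.2258     | $270.90      | frontier |
| Gemini 3.5 Flash      | $1.50     | $9.00      | $0.2322     | $278.64      | frontier |
| Claude Sonnet 4.6     | $3.00     | $15.00     | $0.4257     | $510.84      | mid      |
| Claude Opus 4.7       | $5.00     | $25.00     | $0.7095     | $851.40      | frontier |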

Same agent loop for every row: 8 steps/loop, 9,000 input + 1,500 output tokens per step, a convergence check that adds the equivalent of 0.6 of a step (convergence_check_pct = 60), 40 markets/day, 30-day crypto calendar. Monthly cost is the engine's own output for that loop shape at each model's verified list rate, not a benchmark run.
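
The arithmetic behind each row can be reproduced in a few lines. The sketch below is a reconstruction from the published inputs and outputs, not the calculator's actual source: it assumes the convergence check is billed as 0.6 of one ordinary step, which matches the engine figures in this article. The names `RATES`, `loop_cost`, and `monthly_cost` are hypothetical.

```python
# Hypothetical reconstruction of the cost envelope arithmetic; the shipped
# engine may be implemented differently.

# (input $/Mtok, output $/Mtok) list rates from the table above.
RATES = {
    "Gemini 2.5 Flash-Lite": (0.10, 0.40),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "Gemini 3.5 Flash": (1.50, 9.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Opus 4.7": (5.00, 25.00),
}


def loop_cost(in_rate, out_rate, in_tok=9_000, out_tok=1_500,
              steps=8, convergence_pct=60):
    """Cost in USD of one agent loop."""
    step = (in_tok * in_rate + out_tok * out_rate) / 1_000_000
    # Assumption: the convergence check costs convergence_pct % of one step.
    return step * (steps + convergence_pct / 100)


def monthly_cost(in_rate, out_rate, markets_per_day=40, days=30, **shape):
    """Monthly cost in USD, running one loop per market per day."""
    return loop_cost(in_rate, out_rate, **shape) * markets_per_day * days


for name, (in_rate, out_rate) in RATES.items():
    print(f"{name:<22} ${loop_cost(in_rate, out_rate):.4f}/loop  "
          f"${monthly_cost(in_rate, out_rate):,.2f}/mo")
```

Changing `steps`, `out_tok`, or `markets_per_day` in the call shows how each input moves the monthly figure for a different loop shape.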

Why "Flash" is doing a lot of work in the name

The Flash brand has meant "cheap and fast" since the 2.5 line. Gemini 2.5 Flash is $0.30 / $2.50 and Gemini 2.5 Flash-Lite is $0.10 / $0.40, both economy-tier extraction workhorses. Gemini 3.5 Flash keeps the Flash latency profile but moves the price into the frontier band: $1.50 input is 5x Gemini 2.5 Flash, and $9.00 output is ~3.6x its $2.50 output. [1]

Put a number on it. On the loop above, swapping Gemini 2.5 Flash for Gemini 3.5 Flash takes the monthly bill from $66.56 to $278.64 — a 4.2x jump for the same token shape. The latency stays in Flash territory; the cost does not.

Google's frontier claim is a vendor claim

Google positioned Gemini 3.5 Flash as beating Gemini 3.1 Pro on coding and agentic benchmarks at launch. Treat that as a vendor benchmark, not an independently verified result. We have not benchmarked it, and no third-party finance-task eval was available at launch. The defensible reading: it is plausibly the strongest model in Google's lineup for agentic tool-use at this latency, and you should confirm that on your own task before you let the loop cost ride on it.

The honest cost ladder for a finance agent loop

The engine output below ranks all six models on one fixed loop. Read it as a ladder:

  • Genuine cheap tier: Gemini 2.5 Flash-Lite at $15.48/mo and Gemini 2.5 Flash at $66.56/mo. If your agent's per-step work is extraction, classification, or routing, this is where it belongs.
  • Frontier tier (Gemini): Gemini 2.5 Pro at $270.90/mo and Gemini 3.5 Flash at $278.64/mo land within 3% of each other. Gemini 3.5 Flash buys you Flash latency at roughly Pro-tier cost; Gemini 2.5 Pro buys you a 2M context window at roughly the same cost.
  • Claude tier: Sonnet 4.6 at $510.84/mo and Opus 4.7 at $851.40/mo. [2] Output is billed at $15.00 and $25.00 per Mtok, so an output-heavy agent loop costs more here in absolute terms.

The single most important comparison: Gemini 3.5 Flash costs about 18x what Gemini 2.5 Flash-Lite costs on the same loop ($278.64 vs $15.48). The "cheap Gemini" story is Flash-Lite. Gemini 3.5 Flash is the "agent-tier intelligence at Flash speed" story, and you pay frontier prices for it.

Output tokens are the expensive side of an agent loop: budget them

Agent loops are output-heavy relative to one-shot extraction: every step emits tool-call arguments plus reasoning, and a convergence step re-reads and re-summarizes. Gemini 3.5 Flash's output rate ($9.00) is 6x its input rate ($1.50), so on the loop above the 1,500 output tokens per step cost as much as the 9,000 input tokens ($0.0135 each). Output is half the bill from one seventh of the tokens. Two levers actually move the bill:

  1. Cap output per step. Tighter tool-call schemas and terse intermediate reasoning cut the most expensive tokens directly.
  2. Tier the loop. Run extraction and routing steps on Flash-Lite, reserve Gemini 3.5 Flash for the steps that genuinely need agent-tier judgment. A loop that runs 6 cheap steps and 2 frontier steps lands far below an all-frontier loop.
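
Both levers can be priced before changing the agent. The sketch below is a minimal illustration under stated assumptions: the convergence check is billed as 0.6 of a step and runs on the frontier model, the output cap of 1,000 tokens is a made-up example value, and the helper names are hypothetical rather than part of any shipped tool.

```python
# Hypothetical helpers for pricing the two levers; not the calculator's code.

FLASH_35 = (1.50, 9.00)    # Gemini 3.5 Flash, $/Mtok (input, output)
FLASH_LITE = (0.10, 0.40)  # Gemini 2.5 Flash-Lite, $/Mtok (input, output)
LOOPS_PER_MONTH = 40 * 30  # 40 markets/day on a 30-day calendar


def step_cost(rates, in_tok=9_000, out_tok=1_500):
    """USD cost of one step at the given (input, output) $/Mtok rates."""
    in_rate, out_rate = rates
    return (in_tok * in_rate + out_tok * out_rate) / 1_000_000


def tiered_loop_cost(cheap_steps, frontier_steps, convergence=0.6):
    """USD cost of a loop mixing Flash-Lite steps and 3.5 Flash steps.

    Assumption: the convergence check runs on the frontier model.
    """
    return (cheap_steps * step_cost(FLASH_LITE)
            + (frontier_steps + convergence) * step_cost(FLASH_35))


# Baseline: all 8 steps on Gemini 3.5 Flash.
baseline = tiered_loop_cost(0, 8) * LOOPS_PER_MONTH

# Lever 1: cap output at 1,000 tokens per step (example value).
capped = 8.6 * step_cost(FLASH_35, out_tok=1_000) * LOOPS_PER_MONTH

# Lever 2: 6 steps on Flash-Lite, 2 steps plus convergence on 3.5 Flash.
tiered = tiered_loop_cost(6, 2) * LOOPS_PER_MONTH

print(f"baseline ${baseline:.2f}  capped ${capped:.2f}  tiered ${tiered:.2f}")
```

Under these assumptions the capped loop comes to about $232 per month and the 6/2 tiered loop to about $95, against $278.64 for the all-frontier baseline.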

The Token Cost Optimizer prices the per-call side of this with caching, and the Model Selector for Finance maps task tiers to models so you don't put a frontier model on an extraction step by accident.

Decision guidance

  • You need agent-tier reasoning at sub-second latency, and the budget is real: Gemini 3.5 Flash. Budget ~$279/mo for the loop above and scale linearly with markets/day and steps/loop.
  • You want the cheapest Gemini that still fits a full filing in context: Gemini 2.5 Flash-Lite (1M context, ~$15/mo on this loop). This is the budget option, not Gemini 3.5 Flash.
  • You need a 2M context window at frontier quality: Gemini 2.5 Pro, at near-identical cost to Gemini 3.5 Flash but slower.
  • Your loop is output-heavy: expect the Claude tiers to cost the most in absolute terms ($15.00 and $25.00 per Mtok output); price the exact shape before committing.
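
Because the envelope scales linearly with markets per day, a monthly budget can be turned into a ceiling on coverage. A minimal sketch, assuming the $0.2322 per-loop figure for Gemini 3.5 Flash and a 30-day calendar; `max_markets_per_day` is a hypothetical helper name.

```python
import math


def max_markets_per_day(monthly_budget_usd, cost_per_loop_usd, days=30):
    """Largest whole number of markets/day that stays within the budget."""
    # Small epsilon guards against float noise when the budget divides evenly.
    return math.floor(monthly_budget_usd / (cost_per_loop_usd * days) + 1e-9)


# Gemini 3.5 Flash on the loop above costs $0.2322 per loop.
print(max_markets_per_day(1_500, 0.2322))   # the $1,500 target used in the engine runs
print(max_markets_per_day(278.64, 0.2322))  # the $278.64 envelope itself
```

The same function works for any row of the table: pass that model's per-loop cost instead.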

References

Footnotes

  1. Google. "Gemini Developer API pricing." ai.google.dev, verified 2026-05-25. https://ai.google.dev/gemini-api/docs/pricing

  2. Anthropic. "Pricing." platform.claude.com, verified 2026-05-25. https://platform.claude.com/docs/en/about-claude/pricing

Verified engine output

Finance agent loop — Gemini 3.5 Flash (8 steps, 40 markets/day, crypto calendar)
Inputs
input_tokens_per_step: 9000
output_tokens_per_step: 1500
steps_per_loop: 8
convergence_check_pct: 60
markets_per_day: 40
target_monthly_usd: 1500
calendar_mode: crypto
model_id: gemini-3-5-flash
Result
model › id: gemini-3-5-flash
model › provider: google
model › name: Gemini 3.5 Flash
model › tier: frontier
model › input usd per mtoken: 1.5
model › output usd per mtoken: 9
model › context window: 1000000
model › notes: Frontier agent-tier at Flash speed — output ~3.6x Gemini 2.5 Flash, not a budget pick.
steps (9 items): [...]
cost per loop: 0.23219999999999996
tool use subtotal: 0.21599999999999997
convergence cost: 0.016199999999999996
cost per day: 9.287999999999998
cost per month: 278.63999999999993
days per month: 30
tokens per loop: 90300
blended usd per 1k tokens: 0.0025714285714285713
within budget: true
budget utilization: 0.18575999999999995

Computed live at build time.
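
The derived fields in the result follow from the inputs. The sketch below reproduces three of them for the Gemini 3.5 Flash run, assuming the convergence check counts as 0.6 of a step for tokens as well as cost; that reading is inferred from the figures above, not taken from the engine source.

```python
# Reproduces derived fields from the Gemini 3.5 Flash result above.
# Assumption: convergence adds 0.6 of a step's tokens and cost.

in_tok, out_tok = 9_000, 1_500
steps, convergence = 8, 0.60
cost_per_loop = 0.2322       # USD, from the result above
cost_per_month = 278.64      # USD, from the result above
target_monthly_usd = 1_500

tokens_per_loop = round((in_tok + out_tok) * (steps + convergence))
blended_usd_per_1k = cost_per_loop / (tokens_per_loop / 1_000)
budget_utilization = cost_per_month / target_monthly_usd

print(tokens_per_loop)               # 90300
print(round(blended_usd_per_1k, 6))  # 0.002571
print(round(budget_utilization, 5))  # 0.18576
```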

Same loop — Gemini 2.5 Flash-Lite (genuine budget tier)
Inputs
input_tokens_per_step: 9000
output_tokens_per_step: 1500
steps_per_loop: 8
convergence_check_pct: 60
markets_per_day: 40
target_monthly_usd: 1500
calendar_mode: crypto
model_id: gemini-2-5-flash-lite
Result
model › id: gemini-2-5-flash-lite
model › provider: google
model › name: Gemini 2.5 Flash-Lite
model › tier: economy
model › input usd per mtoken: 0.1
model › output usd per mtoken: 0.4
model › context window: 1000000
model › notes: Cheapest tier in this table.
steps (9 items): [...]
cost per loop: 0.0129
tool use subtotal: 0.012
convergence cost: 0.0009
cost per day: 0.516
cost per month: 15.48
days per month: 30
tokens per loop: 90300
blended usd per 1k tokens: 0.00014285714285714284
within budget: true
budget utilization: 0.010320000000000001

Computed live at build time.

Same loop — Gemini 2.5 Pro
Inputs
input_tokens_per_step: 9000
output_tokens_per_step: 1500
steps_per_loop: 8
convergence_check_pct: 60
markets_per_day: 40
target_monthly_usd: 1500
calendar_mode: crypto
model_id: gemini-2-5-pro
Result
model › id: gemini-2-5-pro
model › provider: google
model › name: Gemini 2.5 Pro
model › tier: frontier
model › input usd per mtoken: 1.25
model › output usd per mtoken: 10
model › context window: 2000000
model › notes: Largest context (2M).
steps (9 items): [...]
cost per loop: 0.22575
tool use subtotal: 0.21
convergence cost: 0.01575
cost per day: 9.030000000000001
cost per month: 270.90000000000003
days per month: 30
tokens per loop: 90300
blended usd per 1k tokens: 0.0025
within budget: true
budget utilization: 0.1806

Computed live at build time.

Same loop — Gemini 2.5 Flash
Inputs
input_tokens_per_step: 9000
output_tokens_per_step: 1500
steps_per_loop: 8
convergence_check_pct: 60
markets_per_day: 40
target_monthly_usd: 1500
calendar_mode: crypto
model_id: gemini-2-5-flash
Result
model › id: gemini-2-5-flash
model › provider: google
model › name: Gemini 2.5 Flash
model › tier: economy
model › input usd per mtoken: 0.3
model › output usd per mtoken: 2.5
model › context window: 1000000
model › notes: Fast mid-tier; 1M context.
steps (9 items): [...]
cost per loop: 0.05546999999999999
tool use subtotal: 0.05159999999999999
convergence cost: 0.0038699999999999993
cost per day: 2.2188
cost per month: 66.564
days per month: 30
tokens per loop: 90300
blended usd per 1k tokens: 0.0006142857142857142
within budget: true
budget utilization: 0.04437599999999999

Computed live at build time.

Same loop — Claude Sonnet 4.6
Inputs
input_tokens_per_step: 9000
output_tokens_per_step: 1500
steps_per_loop: 8
convergence_check_pct: 60
markets_per_day: 40
target_monthly_usd: 1500
calendar_mode: crypto
model_id: claude-sonnet-4-6
Result
model › id: claude-sonnet-4-6
model › provider: anthropic
model › name: Claude Sonnet 4.6
model › tier: mid
model › input usd per mtoken: 3
model › output usd per mtoken: 15
model › cache read usd per mtoken: 0.3
model › context window: 500000
model › notes: Default pick for bulk research loops.
steps (9 items): [...]
cost per loop: 0.42569999999999997
tool use subtotal: 0.39599999999999996
convergence cost: 0.029699999999999997
cost per day: 17.028
cost per month: 510.84
days per month: 30
tokens per loop: 90300
blended usd per 1k tokens: 0.004714285714285714
within budget: true
budget utilization: 0.34056

Computed live at build time.

Same loop — Claude Opus 4.7
Inputs
input_tokens_per_step: 9000
output_tokens_per_step: 1500
steps_per_loop: 8
convergence_check_pct: 60
markets_per_day: 40
target_monthly_usd: 1500
calendar_mode: crypto
model_id: claude-opus-4-7
Result
model › id: claude-opus-4-7
model › provider: anthropic
model › name: Claude Opus 4.7
model › tier: frontier
model › input usd per mtoken: 5
model › output usd per mtoken: 25
model › cache read usd per mtoken: 0.5
model › context window: 1000000
model › notes: Frontier reasoning — 1M context.
steps (9 items): [...]
cost per loop: 0.7094999999999999
tool use subtotal: 0.6599999999999999
convergence cost: 0.049499999999999995
cost per day: 28.379999999999995
cost per month: 851.3999999999999
days per month: 30
tokens per loop: 90300
blended usd per 1k tokens: 0.007857142857142856
within budget: true
budget utilization: 0.5675999999999999

Computed live at build time.

Frequently asked questions

Is Gemini 3.5 Flash a budget model?
No. At $1.50/$9.00 per Mtok its output rate is about 3.6x Gemini 2.5 Flash and roughly on par with Gemini 2.5 Pro. The genuine budget tier is Gemini 2.5 Flash-Lite ($0.10/$0.40). On a fixed finance agent loop, Gemini 3.5 Flash costs about 18x Flash-Lite.
What does a finance agent loop cost on Gemini 3.5 Flash?
On an 8-step loop (9k in / 1.5k out per step, 60% convergence check, 40 markets per day, 30-day calendar) the engine prices it at $0.2322 per loop and $278.64 per month. Cost scales linearly with steps, markets, and output tokens.
Is Google's claim that 3.5 Flash beats 3.1 Pro verified?
That is Google's own benchmark claim from the I/O launch. It has not been independently verified here, and no third-party finance-task eval was available at launch. Confirm it on your own task before relying on it.
Gemini 3.5 Flash or Gemini 2.5 Pro for a finance agent?
They cost within 3% on the same loop ($278.64 vs $270.90 per month). Pick Gemini 3.5 Flash for Flash latency, Gemini 2.5 Pro for the 2M context window. The price barely decides it.
Where do these numbers come from?
Each model's verified 2026-05-25 list rate, run through the Agent Cost Envelope Calculator on one fixed loop shape. The numbers are recomputed from the shipped bundle, not from a benchmark run.
What's the cheapest model for a finance agent loop under $50/month?
On this 8-step loop only Gemini 2.5 Flash-Lite ($15.48 per month) comes in under $50. Gemini 2.5 Flash is next at $66.56 per month, already over budget. Everything above that (Gemini 2.5 Pro at $270.90, Gemini 3.5 Flash at $278.64, Sonnet 4.6 at $510.84, Opus 4.7 at $851.40) is several times a $50 budget on the same loop shape.
Which Gemini tier should I use for agent loops on a tight budget?
Gemini 2.5 Flash-Lite for extraction, classification, and routing steps ($15.48 per month on this loop). Reserve Gemini 3.5 Flash ($278.64 per month) for the few steps that genuinely need agent-tier judgment. A tiered loop — most steps on Flash-Lite, a couple on 3.5 Flash — lands far below the $278.64 all-frontier figure.
Is Gemini 3.5 Flash worth it over Flash-Lite for a finance agent that mostly extracts and routes?
No. On the same loop, 3.5 Flash costs about 18x Flash-Lite ($278.64 vs $15.48 per month) and Flash-Lite still fits a full filing in its 1M context. Pay for 3.5 Flash only on steps that need agent-tier reasoning at Flash latency, not on extraction or routing.