The short answer
Gemini 3.5 Flash (launched May 19, 2026 at Google I/O) is a frontier agent-tier model running at Flash latency, priced at $1.50 / $9.00 per Mtok. That is the draw and the trap: the intelligence-per-second is the reason to put it in a finance agent loop, but at $9.00 output it is not a budget model. On a realistic finance research-agent loop (8 steps, 9k in / 1.5k out per step, 40 markets/day, crypto calendar) the Agent Cost Envelope Calculator prices the loop at $0.2322 and the month at $278.64 — roughly the same envelope as Gemini 2.5 Pro ($270.90/mo) and about 18x the genuine cheap tier, Gemini 2.5 Flash-Lite ($15.48/mo). All figures below are computed live from the shipped engine bundle, not typed by hand.
TL;DR
| Model | $/Mtok in | $/Mtok out | Cost / loop | Cost / month | Tier |
|---|---|---|---|---|---|
| Gemini 2.5 Flash-Lite | $0.10 | $0.40 | $0.0129 | $15.48 | economy |
| Gemini 2.5 Flash | $0.30 | $2.50 | $0.0555 | $66.56 | economy |
| Gemini 2.5 Pro | $1.25 | $10.00 | $0.2258 | $270.90 | frontier |
| Gemini 3.5 Flash | $1.50 | $9.00 | $0.2322 | $278.64 | frontier |
| Claude Sonnet 4.6 | $3.00 | $15.00 | $0.4257 | $510.84 | mid |
| Claude Opus 4.7 | $5.00 | $25.00 | $0.7095 | $851.40 | frontier |
Same agent loop for every row: 8 steps/loop, 9,000 input + 1,500 output tokens per step, a 60% convergence-check step, 40 markets/day, 30-day crypto calendar. Monthly cost is the engine's own output for that loop shape on each model's verified list rate, not a benchmark run.
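The engine's source is not reproduced here, but its published outputs are consistent with a simple formula: price one step from the input and output rates, then bill the loop as 8 full steps plus a convergence check worth 0.6 of a step. The Python sketch below rebuilds the table under that assumption. The names `loop_cost` and `monthly_cost` are illustrative and are not the engine's API.

```python
# USD per million tokens (input, output), copied from the table above.
RATES = {
    "Gemini 2.5 Flash-Lite": (0.10, 0.40),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "Gemini 3.5 Flash": (1.50, 9.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Opus 4.7": (5.00, 25.00),
}


def loop_cost(in_rate, out_rate, steps=8, tokens_in=9_000,
              tokens_out=1_500, convergence_pct=60):
    """Cost of one loop in USD (assumed formula, inferred from the outputs)."""
    step = (tokens_in * in_rate + tokens_out * out_rate) / 1_000_000
    # The convergence check is billed as a fraction of one ordinary step.
    return step * (steps + convergence_pct / 100)


def monthly_cost(in_rate, out_rate, markets_per_day=40, days=30, **loop_kwargs):
    """One loop per market per day over the calendar month."""
    return loop_cost(in_rate, out_rate, **loop_kwargs) * markets_per_day * days


for name, (in_rate, out_rate) in RATES.items():
    per_loop = loop_cost(in_rate, out_rate)
    per_month = monthly_cost(in_rate, out_rate)
    print(f"{name:<22} ${per_loop:.4f}/loop  ${per_month:,.2f}/month")
```

Changing `steps`, `tokens_out`, or `markets_per_day` in the calls shows how the envelope moves when the loop shape changes.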
Why "Flash" is doing a lot of work in the name
The Flash brand has meant "cheap and fast" since the 2.5 line. Gemini 2.5 Flash is $0.30 / $2.50 and Gemini 2.5 Flash-Lite is $0.10 / $0.40, both economy-tier extraction workhorses. Gemini 3.5 Flash keeps the Flash latency profile but moves the price into the frontier band: $1.50 input is 5x Gemini 2.5 Flash's $0.30, and $9.00 output is ~3.6x its $2.50 output [1].
Put a number on it. On the loop above, swapping Gemini 2.5 Flash for Gemini 3.5 Flash takes the monthly bill from $66.56 to $278.64 — a 4.2x jump for the same token shape. The latency stays in Flash territory; the cost does not.
Google's frontier claim is a vendor claim
Google positioned Gemini 3.5 Flash as beating Gemini 3.1 Pro on coding and agentic benchmarks at launch. Treat that as a vendor benchmark, not an independently verified result. We have not benchmarked it, and no third-party finance-task eval was available at launch. The defensible reading: it is plausibly the strongest model in Google's lineup for agentic tool-use at this latency, and you should confirm that on your own task before you let the loop cost ride on it.
The honest cost ladder for a finance agent loop
The engine output below ranks all six models on one fixed loop. Read it as a ladder:
- Genuine cheap tier: Gemini 2.5 Flash-Lite at $15.48/mo and Gemini 2.5 Flash at $66.56/mo. If your agent's per-step work is extraction, classification, or routing, this is where it belongs.
- Frontier tier (Gemini): Gemini 2.5 Pro at $270.90/mo and Gemini 3.5 Flash at $278.64/mo land within 3% of each other. Gemini 3.5 Flash buys you Flash latency at roughly Pro-tier cost; Gemini 2.5 Pro buys you a 2M context window at roughly the same cost.
- Claude tier: Sonnet 4.6 at $510.84/mo and Opus 4.7 at $851.40/mo [2]. Their output rates are the highest in the table in absolute terms ($15.00 and $25.00 per Mtok), so an output-heavy agent loop costs proportionally more here.
The single most important comparison: Gemini 3.5 Flash costs about 18x what Gemini 2.5 Flash-Lite costs on the same loop ($278.64 vs $15.48). The "cheap Gemini" story is Flash-Lite. Gemini 3.5 Flash is the "agent-tier intelligence at Flash speed" story, and you pay frontier prices for it.
Output tokens dominate an agent loop — budget them
Agent loops are output-heavy relative to one-shot extraction: every step emits tool-call arguments plus reasoning, and a convergence step re-reads and re-summarizes. Gemini 3.5 Flash's output rate ($9.00) is 6x its input rate ($1.50), so output costs far more than its token count suggests: on the loop above, the 1,500 output tokens per step cost $0.0135, exactly as much as the 9,000 input tokens. Output is one-seventh of the tokens and half of the bill, and it is the half you control most directly. Two levers actually move the bill:
- Cap output per step. Tighter tool-call schemas and terse intermediate reasoning cut the most expensive tokens directly.
- Tier the loop. Run extraction and routing steps on Flash-Lite, and reserve Gemini 3.5 Flash for the steps that genuinely need agent-tier judgment. A loop that runs 6 cheap steps and 2 frontier steps lands far below an all-frontier loop.
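Both levers can be priced with a few lines of arithmetic. The Python sketch below is illustrative only. It assumes the same per-step token shape on every model, a hypothetical cap of 1,000 output tokens per step, and that the convergence check stays on Gemini 3.5 Flash in the tiered case. None of these choices comes from the engine itself.

```python
# USD per million tokens (input, output), from the pricing table.
FLASH_35 = (1.50, 9.00)
FLASH_LITE = (0.10, 0.40)

LOOPS_PER_MONTH = 40 * 30  # 40 markets/day on a 30-day calendar
CONVERGENCE = 0.60         # convergence check billed as 0.6 of a step


def step_cost(rates, tokens_in=9_000, tokens_out=1_500):
    """USD for one step at the given (input, output) rates."""
    in_rate, out_rate = rates
    return (tokens_in * in_rate + tokens_out * out_rate) / 1_000_000


# Baseline: all 8 steps plus the convergence check on Gemini 3.5 Flash.
baseline = (8 + CONVERGENCE) * step_cost(FLASH_35) * LOOPS_PER_MONTH

# Lever 1: hypothetical cap of 1,000 output tokens per step.
capped = (8 + CONVERGENCE) * step_cost(FLASH_35, tokens_out=1_000) * LOOPS_PER_MONTH

# Lever 2: 6 steps on Flash-Lite, 2 steps plus convergence on 3.5 Flash.
tiered = (
    6 * step_cost(FLASH_LITE) + (2 + CONVERGENCE) * step_cost(FLASH_35)
) * LOOPS_PER_MONTH

for label, usd in [("all 3.5 Flash", baseline),
                   ("output capped", capped),
                   ("tiered 6+2", tiered)]:
    print(f"{label:<14} ${usd:,.2f}/month")
```

Under these assumptions the capped loop comes to about $232/month and the tiered loop to about $95/month, against $278.64 for the all-frontier baseline. Whether a 6+2 split preserves answer quality is a separate question the arithmetic cannot settle.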
The Token Cost Optimizer prices the per-call side of this with caching, and the Model Selector for Finance maps task tiers to models so you don't put a frontier model on an extraction step by accident.
Decision guidance
- You need agent-tier reasoning at sub-second latency, and the budget is real: Gemini 3.5 Flash. Budget ~$279/mo for the loop above; cost scales linearly with markets/day and close to linearly with steps/loop, because the convergence check adds a fixed 0.6 of a step.
- You want the cheapest Gemini that still fits a full filing in context: Gemini 2.5 Flash-Lite (1M context, ~$15/mo on this loop). This is the budget option, not Gemini 3.5 Flash.
- You need a 2M context window at frontier quality: Gemini 2.5 Pro, at near-identical cost to Gemini 3.5 Flash but slower.
- Your loop is output-heavy: expect Claude tiers to cost proportionally more; price the exact shape before committing.
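Because cost is linear in markets/day, the same arithmetic can be inverted to ask how much coverage a fixed budget buys. The Python sketch below assumes the loop-cost formula implied by the engine output (8 steps plus a convergence check worth 0.6 of a step). The helper name `max_markets_per_day` is hypothetical.

```python
import math


def max_markets_per_day(in_rate, out_rate, monthly_budget_usd, steps=8,
                        tokens_in=9_000, tokens_out=1_500,
                        convergence_pct=60, days=30):
    """Largest whole number of markets/day that stays within the budget."""
    step = (tokens_in * in_rate + tokens_out * out_rate) / 1_000_000
    per_loop = step * (steps + convergence_pct / 100)
    # One loop per market per day, so the daily cost is per_loop * markets.
    return math.floor(monthly_budget_usd / (per_loop * days))


# The engine inputs use a $1,500 monthly target.
print(max_markets_per_day(1.50, 9.00, 1_500))  # Gemini 3.5 Flash
print(max_markets_per_day(0.10, 0.40, 1_500))  # Gemini 2.5 Flash-Lite
```

At the $1,500 target this gives 215 markets/day on Gemini 3.5 Flash and 3,875 on Gemini 2.5 Flash-Lite for the same loop shape.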
Connects to
- Agent Cost Envelope Calculator: the loop-cost engine behind every number here.
- Token cost reality for LLM trading research: the per-call cost discipline this loop sits on top of.
- Claude vs GPT-5 vs Gemini for Financial Analysis 2026: the tier-vs-vendor framing across all three families.
- Best LLM for Financial Analysis 2026: the task-tiered pillar that places each model.
References
- [1] Google. "Gemini Developer API pricing." ai.google.dev, verified 2026-05-25. https://ai.google.dev/gemini-api/docs/pricing
- [2] Anthropic. "Pricing." platform.claude.com, verified 2026-05-25. https://platform.claude.com/docs/en/about-claude/pricing
Verified engine output
| Input | Value |
|---|---|
| input_tokens_per_step | 9000 |
| output_tokens_per_step | 1500 |
| steps_per_loop | 8 |
| convergence_check_pct | 60 |
| markets_per_day | 40 |
| target_monthly_usd | 1500 |
| calendar_mode | crypto |
| model_id | gemini-3-5-flash |

| Output | Value |
|---|---|
| model › id | gemini-3-5-flash |
| model › provider | |
| model › name | Gemini 3.5 Flash |
| model › tier | frontier |
| model › input usd per mtoken | 1.5 |
| model › output usd per mtoken | 9 |
| model › context window | 1000000 |
| model › notes | Frontier agent-tier at Flash speed — output ~3.6x Gemini 2.5 Flash, not a budget pick. |
| steps (9 items) | [...] |
| cost per loop | 0.23219999999999996 |
| tool use subtotal | 0.21599999999999997 |
| convergence cost | 0.016199999999999996 |
| cost per day | 9.287999999999998 |
| cost per month | 278.63999999999993 |
| days per month | 30 |
| tokens per loop | 90300 |
| blended usd per 1k tokens | 0.0025714285714285713 |
| within budget | true |
| budget utilization | 0.18575999999999995 |
Computed live at build time.
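The derived fields in a block like this one can be cross-checked from its own inputs. The Python sketch below uses relationships inferred from the published numbers, not taken from the engine's code: tokens per loop counts 8 steps plus 0.6 of a step, the blended rate divides loop cost by thousands of tokens, and budget utilization divides monthly cost by `target_monthly_usd`. The `<=` comparison for the within-budget flag is an assumption.

```python
def derived_fields(cost_per_loop, tokens_in=9_000, tokens_out=1_500, steps=8,
                   convergence_pct=60, markets_per_day=40, days=30,
                   target_monthly_usd=1_500):
    """Recompute the derived output fields from the inputs (assumed relations)."""
    # Integer arithmetic keeps the token count exact: 10,500 * 8.6 = 90,300.
    tokens_per_loop = (tokens_in + tokens_out) * (steps * 100 + convergence_pct) // 100
    cost_per_month = cost_per_loop * markets_per_day * days
    return {
        "tokens_per_loop": tokens_per_loop,
        "blended_usd_per_1k_tokens": cost_per_loop / (tokens_per_loop / 1_000),
        "cost_per_month": cost_per_month,
        "budget_utilization": cost_per_month / target_monthly_usd,
        "within_budget": cost_per_month <= target_monthly_usd,  # assumed rule
    }


# Gemini 3.5 Flash: cost per loop taken from the block above, rounded.
fields = derived_fields(cost_per_loop=0.2322)
for key, value in fields.items():
    print(f"{key}: {value}")
```

The long decimal tails in the engine output (for example 278.63999999999993) are ordinary floating-point rounding, so comparisons against the rounded table values should use a small tolerance.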
| Input | Value |
|---|---|
| input_tokens_per_step | 9000 |
| output_tokens_per_step | 1500 |
| steps_per_loop | 8 |
| convergence_check_pct | 60 |
| markets_per_day | 40 |
| target_monthly_usd | 1500 |
| calendar_mode | crypto |
| model_id | gemini-2-5-flash-lite |

| Output | Value |
|---|---|
| model › id | gemini-2-5-flash-lite |
| model › provider | |
| model › name | Gemini 2.5 Flash-Lite |
| model › tier | economy |
| model › input usd per mtoken | 0.1 |
| model › output usd per mtoken | 0.4 |
| model › context window | 1000000 |
| model › notes | Cheapest tier in this table. |
| steps (9 items) | [...] |
| cost per loop | 0.0129 |
| tool use subtotal | 0.012 |
| convergence cost | 0.0009 |
| cost per day | 0.516 |
| cost per month | 15.48 |
| days per month | 30 |
| tokens per loop | 90300 |
| blended usd per 1k tokens | 0.00014285714285714284 |
| within budget | true |
| budget utilization | 0.010320000000000001 |
Computed live at build time.
| Input | Value |
|---|---|
| input_tokens_per_step | 9000 |
| output_tokens_per_step | 1500 |
| steps_per_loop | 8 |
| convergence_check_pct | 60 |
| markets_per_day | 40 |
| target_monthly_usd | 1500 |
| calendar_mode | crypto |
| model_id | gemini-2-5-pro |

| Output | Value |
|---|---|
| model › id | gemini-2-5-pro |
| model › provider | |
| model › name | Gemini 2.5 Pro |
| model › tier | frontier |
| model › input usd per mtoken | 1.25 |
| model › output usd per mtoken | 10 |
| model › context window | 2000000 |
| model › notes | Largest context (2M). |
| steps (9 items) | [...] |
| cost per loop | 0.22575 |
| tool use subtotal | 0.21 |
| convergence cost | 0.01575 |
| cost per day | 9.030000000000001 |
| cost per month | 270.90000000000003 |
| days per month | 30 |
| tokens per loop | 90300 |
| blended usd per 1k tokens | 0.0025 |
| within budget | true |
| budget utilization | 0.1806 |
Computed live at build time.
| Input | Value |
|---|---|
| input_tokens_per_step | 9000 |
| output_tokens_per_step | 1500 |
| steps_per_loop | 8 |
| convergence_check_pct | 60 |
| markets_per_day | 40 |
| target_monthly_usd | 1500 |
| calendar_mode | crypto |
| model_id | gemini-2-5-flash |

| Output | Value |
|---|---|
| model › id | gemini-2-5-flash |
| model › provider | |
| model › name | Gemini 2.5 Flash |
| model › tier | economy |
| model › input usd per mtoken | 0.3 |
| model › output usd per mtoken | 2.5 |
| model › context window | 1000000 |
| model › notes | Fast mid-tier; 1M context. |
| steps (9 items) | [...] |
| cost per loop | 0.05546999999999999 |
| tool use subtotal | 0.05159999999999999 |
| convergence cost | 0.0038699999999999993 |
| cost per day | 2.2188 |
| cost per month | 66.564 |
| days per month | 30 |
| tokens per loop | 90300 |
| blended usd per 1k tokens | 0.0006142857142857142 |
| within budget | true |
| budget utilization | 0.04437599999999999 |
Computed live at build time.
| Input | Value |
|---|---|
| input_tokens_per_step | 9000 |
| output_tokens_per_step | 1500 |
| steps_per_loop | 8 |
| convergence_check_pct | 60 |
| markets_per_day | 40 |
| target_monthly_usd | 1500 |
| calendar_mode | crypto |
| model_id | claude-sonnet-4-6 |

| Output | Value |
|---|---|
| model › id | claude-sonnet-4-6 |
| model › provider | anthropic |
| model › name | Claude Sonnet 4.6 |
| model › tier | mid |
| model › input usd per mtoken | 3 |
| model › output usd per mtoken | 15 |
| model › cache read usd per mtoken | 0.3 |
| model › context window | 500000 |
| model › notes | Default pick for bulk research loops. |
| steps (9 items) | [...] |
| cost per loop | 0.42569999999999997 |
| tool use subtotal | 0.39599999999999996 |
| convergence cost | 0.029699999999999997 |
| cost per day | 17.028 |
| cost per month | 510.84 |
| days per month | 30 |
| tokens per loop | 90300 |
| blended usd per 1k tokens | 0.004714285714285714 |
| within budget | true |
| budget utilization | 0.34056 |
Computed live at build time.
| Input | Value |
|---|---|
| input_tokens_per_step | 9000 |
| output_tokens_per_step | 1500 |
| steps_per_loop | 8 |
| convergence_check_pct | 60 |
| markets_per_day | 40 |
| target_monthly_usd | 1500 |
| calendar_mode | crypto |
| model_id | claude-opus-4-7 |

| Output | Value |
|---|---|
| model › id | claude-opus-4-7 |
| model › provider | anthropic |
| model › name | Claude Opus 4.7 |
| model › tier | frontier |
| model › input usd per mtoken | 5 |
| model › output usd per mtoken | 25 |
| model › cache read usd per mtoken | 0.5 |
| model › context window | 1000000 |
| model › notes | Frontier reasoning — 1M context. |
| steps (9 items) | [...] |
| cost per loop | 0.7094999999999999 |
| tool use subtotal | 0.6599999999999999 |
| convergence cost | 0.049499999999999995 |
| cost per day | 28.379999999999995 |
| cost per month | 851.3999999999999 |
| days per month | 30 |
| tokens per loop | 90300 |
| blended usd per 1k tokens | 0.007857142857142856 |
| within budget | true |
| budget utilization | 0.5675999999999999 |
Computed live at build time.
Frequently asked questions
- Is Gemini 3.5 Flash a budget model?
- No. At $1.50/$9.00 per Mtok its output rate is about 3.6x Gemini 2.5 Flash and roughly on par with Gemini 2.5 Pro. The genuine budget tier is Gemini 2.5 Flash-Lite ($0.10/$0.40). On a fixed finance agent loop, Gemini 3.5 Flash costs about 18x Flash-Lite.
- What does a finance agent loop cost on Gemini 3.5 Flash?
- On an 8-step loop (9k in / 1.5k out per step, 60% convergence check, 40 markets per day, 30-day calendar) the engine prices it at $0.2322 per loop and $278.64 per month. Cost scales linearly with markets per day and close to linearly with steps per loop; extra output tokens add $9.00 per million on top of the input cost.
- Is Google's claim that 3.5 Flash beats 3.1 Pro verified?
- That is Google's own benchmark claim from the I/O launch. It has not been independently verified here, and no third-party finance-task eval was available at launch. Confirm it on your own task before relying on it.
- Gemini 3.5 Flash or Gemini 2.5 Pro for a finance agent?
- They cost within 3% on the same loop ($278.64 vs $270.90 per month). Pick Gemini 3.5 Flash for Flash latency, Gemini 2.5 Pro for the 2M context window. The price barely decides it.
- Where do these numbers come from?
- Each model's verified 2026-05-25 list rate, run through the Agent Cost Envelope Calculator on one fixed loop shape. The numbers are recomputed from the shipped bundle, not from a benchmark run.
- What's the cheapest model for a finance agent loop under $50/month?
- On this 8-step loop only Gemini 2.5 Flash-Lite ($15.48 per month) comes in under $50. Gemini 2.5 Flash is next at $66.56 per month, already over budget. Everything above that (Gemini 2.5 Pro at $270.90, Gemini 3.5 Flash at $278.64, Sonnet 4.6 at $510.84, Opus 4.7 at $851.40) is several times a $50 budget on the same loop shape.
- Which Gemini tier should I use for agent loops on a tight budget?
- Gemini 2.5 Flash-Lite for extraction, classification, and routing steps ($15.48 per month on this loop). Reserve Gemini 3.5 Flash ($278.64 per month) for the few steps that genuinely need agent-tier judgment. A tiered loop — most steps on Flash-Lite, a couple on 3.5 Flash — lands far below the $278.64 all-frontier figure.
- Is Gemini 3.5 Flash worth it over Flash-Lite for a finance agent that mostly extracts and routes?
- No. On the same loop, 3.5 Flash costs about 18x Flash-Lite ($278.64 vs $15.48 per month) and Flash-Lite still fits a full filing in its 1M context. Pay for 3.5 Flash only on steps that need agent-tier reasoning at Flash latency, not on extraction or routing.