Agent Cost Envelope Calculator

Author: AI Fin Hub Research

Estimates the cost envelope of an LLM research loop. Given the model, tokens per step, number of tool-use steps, convergence-check cost, and markets analyzed per day, it returns per-loop, daily, and monthly cost.

AI Fin Hub Research · Published Apr 23, 2026

Transparent by design: computed in your browser from a published formula and sourced rates, not a black box. Data verified May 25, 2026. Sources: Anthropic pricing · OpenAI pricing · Google AI / Gemini pricing.

Inputs: Form inputs / CSV
Runtime: Instant
Privacy: Client-side · no upload
API key: Not required
Methodology: Open →

Education · Not investment advice. BaFin/EU framework. Past performance does not indicate future results.

1 · Configure your agent loop

Primary model
Input tokens per step
Output tokens per step
Steps per loop (tool-use)
Convergence-check cost: 10% of a step
Markets analyzed per day
Target monthly budget (USD)
Calendar mode

Monthly cost

$52/mo

Claude Sonnet 4.6 · 10 markets/day · 22 days · inside $200 target

Per loop: $0.237 · Per day: $2.37 · Utilisation: 26%

Cap recommendation

Per-loop budget

$0.909

$200 / month ÷ 220 loops

Max tokens per loop

185.7K

Current: 48.5K

Suggested cap per step

36.4K

You have headroom

2 · Per-loop breakdown

Step	Input cost	Output cost	Total
Step 1: tool call + reasoning	$0.0240	$0.0225	$0.0465
Step 2: tool call + reasoning	$0.0240	$0.0225	$0.0465
Step 3: tool call + reasoning	$0.0240	$0.0225	$0.0465
Step 4: tool call + reasoning	$0.0240	$0.0225	$0.0465
Step 5: tool call + reasoning	$0.0240	$0.0225	$0.0465
Convergence check: final analysis	$0.00240	$0.00225	$0.00465
Per-loop total			$0.237

3 · Sensitivity — swap model tier

Same inputs, every model. Shows what the envelope becomes if you move tier, and which of the 10 models listed fit your target budget.

Model	Vendor	Tier	Per loop	Per day	Per month	vs target
Gemini 2.5 Flash-Lite	Google	economy	$0.00714	$0.071	$2	inside
Gemini 2.5 Flash	Google	economy	$0.031	$0.314	$7	inside
GPT-5.4 mini	OpenAI	mid	$0.065	$0.650	$14	inside
Claude Haiku 4.5	Anthropic	economy	$0.079	$0.790	$17	inside
Gemini 2.5 Pro	Google	frontier	$0.128	$1.27	$28	inside
Gemini 3.5 Flash	Google	frontier	$0.130	$1.30	$29	inside
o4-mini (reasoning)	OpenAI	mid	$0.214	$2.14	$47	inside
Claude Sonnet 4.6 (primary)	Anthropic	mid	$0.237	$2.37	$52	inside
Claude Opus 4.8	Anthropic	frontier	$0.395	$3.95	$87	inside
GPT-5.5	OpenAI	frontier	$0.433	$4.33	$95	inside

How the envelope is priced

step_cost          = input_tokens × in_rate + output_tokens × out_rate
convergence_cost   = step_cost × convergence_pct
loop_cost          = steps × step_cost + convergence_cost
daily_cost         = loop_cost × markets_per_day
monthly_cost       = daily_cost × (22 biz | 30 crypto)
budget_per_loop    = target_monthly / (markets_per_day × days_per_month)
max_tokens_per_loop = budget_per_loop / blended_$_per_token

Pricing verified 2026-04-23. See methodology for the full rate table, calendar-mode rationale, and limitations.
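
The formulas above can be sketched as a small function. This is an illustration, not the published engine: the function and field names are hypothetical, and the example inputs (8,000 input tokens and 1,500 output tokens per step, at $3 and $15 per million tokens) are assumptions back-solved from the per-loop breakdown table rather than rates quoted on this page. The blended per-token rate is also assumed to be loop cost divided by loop tokens, with the convergence check counted as 10% of a step's tokens.

```javascript
// Sketch of the envelope formulas. All rates are USD per token.
function costEnvelope(cfg) {
  const stepCost = cfg.inputTokens * cfg.inRate + cfg.outputTokens * cfg.outRate;
  const convergenceCost = stepCost * cfg.convergencePct;
  const loopCost = cfg.steps * stepCost + convergenceCost;
  const dailyCost = loopCost * cfg.marketsPerDay;
  const monthlyCost = dailyCost * cfg.daysPerMonth;
  const budgetPerLoop = cfg.targetMonthly / (cfg.marketsPerDay * cfg.daysPerMonth);

  // Assumption: the convergence check consumes convergencePct of a step's tokens.
  const stepEquivalents = cfg.steps + cfg.convergencePct;
  const tokensPerLoop = (cfg.inputTokens + cfg.outputTokens) * stepEquivalents;
  const blendedPerToken = loopCost / tokensPerLoop;
  const maxTokensPerLoop = budgetPerLoop / blendedPerToken;
  const suggestedCapPerStep = maxTokensPerLoop / stepEquivalents;

  return {
    stepCost, loopCost, dailyCost, monthlyCost,
    budgetPerLoop, tokensPerLoop, maxTokensPerLoop, suggestedCapPerStep,
  };
}

// Example inputs are assumptions inferred from the breakdown table.
const example = costEnvelope({
  inputTokens: 8000,
  outputTokens: 1500,
  inRate: 3 / 1e6,
  outRate: 15 / 1e6,
  steps: 5,
  convergencePct: 0.10,
  marketsPerDay: 10,
  daysPerMonth: 22, // business-day calendar; use 30 for crypto
  targetMonthly: 200,
});

console.log(example.loopCost.toFixed(3));                  // per-loop cost in USD
console.log(example.monthlyCost.toFixed(0));               // monthly cost in USD
console.log((example.maxTokensPerLoop / 1000).toFixed(1)); // max tokens per loop, in thousands
```

Under these assumed inputs the sketch lands on the same figures the page reports: about $0.237 per loop, $52 per month, a 185.7K token ceiling per loop, and a 36.4K cap per step.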

How to use

Step-by-step


1. Enter your model selection, prompt length, output length, and expected call volume per task.
2. Set retry assumptions: max retries on error, max retries on timeout, and the cache hit rate for the prompt prefix.
3. Read the cost envelope: minimum (best case), median (typical), and 95th percentile (pessimistic). Budget against the 95th percentile if your workflow has heavy-tailed failure modes.
4. Toggle 'fallback chain enabled' to model what happens when the primary model fails. Fallback adds reliability at a multiplied cost.
5. Compare envelopes across model choices. The cheapest model isn't always the cheapest envelope: frequent retries on a weak model can exceed the cost of a stronger model used once.
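
The point in the last step can be checked with simple arithmetic. The sketch below uses the per-loop figures from the sensitivity table ($0.079 for Claude Haiku 4.5, $0.237 for Claude Sonnet 4.6). The retry count of 3 and the helper name are hypothetical, and the sketch assumes every retry is billed as a full repeat of the loop.

```javascript
// Worst-case cost when every allowed retry is used:
// the first attempt plus each retry is billed in full (an assumption).
function worstCaseCost(perLoopCost, maxRetries) {
  return perLoopCost * (1 + maxRetries);
}

const weakWithRetries = worstCaseCost(0.079, 3); // cheaper model, 3 retries (hypothetical)
const strongOnce = worstCaseCost(0.237, 0);      // stronger model, no retries

console.log(weakWithRetries.toFixed(3), strongOnce.toFixed(3));
console.log(weakWithRetries > strongOnce); // true
```

With these numbers the cheaper model costs about $0.316 in the worst case, which exceeds a single $0.237 loop on the stronger model.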

For agents

Use in an agent

Same math, same result shape as the UI above — as a static ES module. No HTTP request, no auth, no rate limit.

import { compute } from "https://aifinhub.io/engines/agent-cost-envelope-calculator.js";

Contract: /contracts/agent-cost-envelope-calculator.json

Questions people ask next

FAQ

What's an agent cost envelope?

The full per-task cost band: minimum (cheapest model + cache hit + no retries) to maximum (best model + no cache + max retries). The envelope shows worst-case spend, not just the average. It matters because agent cost distributions have heavy right tails — one runaway loop can double the monthly bill.

How do retry assumptions affect cost?

Each retry repeats the input cost (prompt + context) and adds a fresh output cost. At 3 retries with a long context, total cost can be 4× the single-call cost. The calculator surfaces both 'happy path' (zero retries) and 'pessimistic' (max retries) numbers — the latter is what to budget against.

Should I budget for the median or the 95th percentile?

If the envelope is volatile (heavy tail of failure modes), budget against the 95th percentile. If retries are rare and cheap, the median is fine. The calculator shows the gap so you can see how tight the distribution is. A 4× ratio between median and 95th means your worst-case is 4× normal — plan accordingly.
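
To see how tight your own distribution is, you can compute the median and 95th percentile from logged per-loop costs. The sketch below uses the nearest-rank percentile method, which is an assumption (the calculator's own percentile method is not stated here), and the sample costs are hypothetical.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Hypothetical per-loop costs in USD: mostly clean runs, a few retried or runaway loops.
const loopCosts = [
  ...Array(16).fill(0.25),
  0.5, 0.5,
  1.0, 1.0,
];

const median = percentile(loopCosts, 50);
const p95 = percentile(loopCosts, 95);
const tailRatio = p95 / median;

console.log(median, p95, tailRatio); // 0.25 1 4
```

A ratio of 4 here matches the example in the answer above: the pessimistic case costs four times a typical loop, so the 95th percentile is the safer budgeting number.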

Does the tool include human-review cost?

No. It models LLM call cost only. Adding human-in-the-loop review can dominate cost for high-stakes agents — but the time/cost varies so much by team that the calculator stays focused on the LLM bill.

Why split prompt-caching savings into a separate field?

Prompt cache hit rate is a deployment characteristic, not an inherent property of the model. A high-hit-rate workload (1000 calls with the same system prompt) gets a 90% discount on the cached portion; a low-hit-rate workload gets no discount. Splitting the field forces you to think about the deployment shape.
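
A minimal sketch of how hit rate changes the expected input cost per call. It takes the 90% discount from the answer above and applies it only to the cached prefix. The function name, token counts, and rate are hypothetical, and the sketch ignores any cache-write surcharge a provider may bill.

```javascript
// Expected input cost per call when part of the prompt is a cacheable prefix.
// Assumption: a cache hit bills the prefix at (1 - cacheDiscount) of the normal rate.
function inputCostWithCache({ inputTokens, cachedPrefixTokens, inRate, hitRate, cacheDiscount = 0.9 }) {
  const uncachedCost = (inputTokens - cachedPrefixTokens) * inRate;
  const prefixFullCost = cachedPrefixTokens * inRate;
  const prefixExpectedCost =
    hitRate * prefixFullCost * (1 - cacheDiscount) + (1 - hitRate) * prefixFullCost;
  return uncachedCost + prefixExpectedCost;
}

// Hypothetical workload: 8,000 input tokens, 6,000 of them a shared system prompt.
const base = { inputTokens: 8000, cachedPrefixTokens: 6000, inRate: 3 / 1e6 };
const alwaysHit = inputCostWithCache({ ...base, hitRate: 1 });
const neverHit = inputCostWithCache({ ...base, hitRate: 0 });

console.log(alwaysHit.toFixed(4), neverHit.toFixed(4));
```

The same prompt costs roughly a third as much at a 100% hit rate as at 0%, which is why the hit rate is worth entering as its own deployment-specific input.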
