Agent Cost Envelope Calculator
Agent cost envelope for LLM research loops: enter the model, tokens per step, tool-use steps, convergence check, and markets per day to get per-loop, daily, and monthly cost.
Transparent by design: computed in your browser from a published formula and sourced rates, not a black box. Data verified May 25, 2026. Sources: Anthropic pricing · OpenAI pricing · Google AI / Gemini pricing. Full methodology →
- Inputs: Form inputs / CSV
- Runtime: Instant
- Privacy: Client-side · no upload
- API key: Not required
- Methodology: Open →
1 · Configure your agent loop
Monthly cost
$52/mo
Claude Sonnet 4.6 · 10 markets/day · 22 days · inside $200 target
Per loop: $0.237 · Per day: $2.37 · Utilisation: 26%
Cap recommendation
Per-loop budget
$0.909
$200 / month ÷ 220 loops
Max tokens per loop
185.7K
Current: 48.5K
Suggested cap per step
36.4K
You have headroom
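The cap figures above can be reproduced with a short sketch. The target, markets per day, and day count come from the panel; the unrounded loop cost and token count, the 10% convergence share, and the per-step cap formula (loop cap divided by steps plus the convergence share) are assumptions chosen because they match the displayed numbers, not the tool's confirmed internals.

```javascript
// Sketch of the cap recommendation. Function and field names are hypothetical.
function capRecommendation({
  targetMonthly,     // USD per month
  marketsPerDay,     // loops run per day
  daysPerMonth,      // 22 for business days, 30 for crypto
  steps,             // tool-use steps per loop
  convergencePct,    // convergence check as a share of one step (assumed 0.10)
  currentLoopCost,   // USD per loop at current settings
  currentLoopTokens, // tokens per loop at current settings
}) {
  const loopsPerMonth = marketsPerDay * daysPerMonth;
  const budgetPerLoop = targetMonthly / loopsPerMonth;
  // Blended price: what one token costs on average across input and output.
  const blendedPerToken = currentLoopCost / currentLoopTokens;
  const maxTokensPerLoop = budgetPerLoop / blendedPerToken;
  // Assumption: the loop cap is spread over the steps plus the convergence share.
  const capPerStep = maxTokensPerLoop / (steps + convergencePct);
  return { loopsPerMonth, budgetPerLoop, maxTokensPerLoop, capPerStep };
}

const cap = capRecommendation({
  targetMonthly: 200,
  marketsPerDay: 10,
  daysPerMonth: 22,
  steps: 5,
  convergencePct: 0.10,
  currentLoopCost: 0.23715,  // assumption: unrounded form of the displayed $0.237
  currentLoopTokens: 48450,  // assumption: unrounded form of the displayed 48.5K
});

console.log(cap.budgetPerLoop.toFixed(3));                   // per-loop budget in USD
console.log((cap.maxTokensPerLoop / 1000).toFixed(1) + "K"); // max tokens per loop
console.log((cap.capPerStep / 1000).toFixed(1) + "K");       // suggested cap per step
```

Because the current loop uses far fewer tokens than the cap, the panel reports headroom; the same function flags an overrun when `currentLoopTokens` exceeds `maxTokensPerLoop`.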
2 · Per-loop breakdown
| Step | Input cost | Output cost | Total |
|---|---|---|---|
| Step 1 — tool call + reasoning | $0.024 | $0.022 | $0.046 |
| Step 2 — tool call + reasoning | $0.024 | $0.022 | $0.046 |
| Step 3 — tool call + reasoning | $0.024 | $0.022 | $0.046 |
| Step 4 — tool call + reasoning | $0.024 | $0.022 | $0.046 |
| Step 5 — tool call + reasoning | $0.024 | $0.022 | $0.046 |
| Convergence check — final analysis | $0.00240 | $0.00225 | $0.00465 |
| Per-loop total | | | $0.237 |
3 · Sensitivity — swap model tier
Same inputs, every model. Shows what the envelope becomes if you move tier, and which of the 10 models fit your target budget.
| Model | Tier | Per loop | Per day | Per month | vs target |
|---|---|---|---|---|---|
| Gemini 2.5 Flash-Lite (google) | economy | $0.00714 | $0.071 | $2 | inside |
| Gemini 2.5 Flash (google) | economy | $0.031 | $0.314 | $7 | inside |
| GPT-5.4 mini (openai) | mid | $0.065 | $0.650 | $14 | inside |
| Claude Haiku 4.5 (anthropic) | economy | $0.079 | $0.790 | $17 | inside |
| Gemini 2.5 Pro (google) | frontier | $0.128 | $1.27 | $28 | inside |
| Gemini 3.5 Flash (google) | frontier | $0.130 | $1.30 | $29 | inside |
| o4-mini (reasoning) (openai) | mid | $0.214 | $2.14 | $47 | inside |
| Claude Sonnet 4.6 (anthropic, primary) | mid | $0.237 | $2.37 | $52 | inside |
| Claude Opus 4.8 (anthropic) | frontier | $0.395 | $3.95 | $87 | inside |
| GPT-5.5 (openai) | frontier | $0.433 | $4.33 | $95 | inside |
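The "vs target" column can be re-derived from the per-loop column alone. The sketch below takes four per-loop figures from the table and scales them by markets per day and days per month; the function name and the "over" label are hypothetical, since the table only ever shows "inside".

```javascript
// Per-loop costs in USD, copied from the sensitivity table (subset of rows).
const perLoopByModel = {
  "Gemini 2.5 Flash-Lite": 0.00714,
  "Claude Haiku 4.5": 0.079,
  "Claude Sonnet 4.6": 0.237,
  "GPT-5.5": 0.433,
};

// Scale each per-loop cost to a month and compare against the target budget.
function sweepModels(perLoopCosts, marketsPerDay, daysPerMonth, targetMonthly) {
  return Object.entries(perLoopCosts).map(([model, loopCost]) => {
    const perDay = loopCost * marketsPerDay;
    const perMonth = perDay * daysPerMonth;
    // "over" is a hypothetical label for rows that exceed the target.
    const verdict = perMonth <= targetMonthly ? "inside" : "over";
    return { model, perDay, perMonth, verdict };
  });
}

// The page's scenario: 10 markets/day, 22 business days, $200 target.
for (const row of sweepModels(perLoopByModel, 10, 22, 200)) {
  console.log(row.model, "$" + row.perMonth.toFixed(0), row.verdict);
}
```

Raising `marketsPerDay` is the quickest way to see where tiers start to fall outside the target: at 30 markets per day the most expensive row no longer fits a $200 budget while the primary model still does.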
How the envelope is priced
step_cost = input_tokens × in_rate + output_tokens × out_rate
convergence_cost = step_cost × convergence_pct
loop_cost = steps × step_cost + convergence_cost
daily_cost = loop_cost × markets_per_day
monthly_cost = daily_cost × (22 biz | 30 crypto)
budget_per_loop = target_monthly / (markets_per_day × days_per_month)
max_tokens_per_loop = budget_per_loop / blended_$_per_token
Pricing verified 2026-04-23. See methodology for the full rate table, calendar-mode rationale, and limitations.
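As a worked instance of the cost formulas, the sketch below reproduces the Claude Sonnet 4.6 example from the page. The token counts (8,000 input and 1,500 output per step), the rates ($3 and $15 per million tokens), and the 10% convergence share are assumptions inferred from the displayed step costs, not values taken from the tool's rate table; the function name is hypothetical.

```javascript
// Sketch of the published envelope formula: step -> loop -> day -> month.
function costEnvelope({
  inputTokens, outputTokens, // tokens per step
  inRate, outRate,           // USD per token
  steps, convergencePct,
  marketsPerDay, daysPerMonth,
}) {
  const stepCost = inputTokens * inRate + outputTokens * outRate;
  const convergenceCost = stepCost * convergencePct;
  const loopCost = steps * stepCost + convergenceCost;
  const dailyCost = loopCost * marketsPerDay;
  const monthlyCost = dailyCost * daysPerMonth;
  return { stepCost, convergenceCost, loopCost, dailyCost, monthlyCost };
}

const sonnetEnvelope = costEnvelope({
  inputTokens: 8000,   // assumption: matches the $0.024 input cost per step
  outputTokens: 1500,  // assumption: matches the ~$0.022 output cost per step
  inRate: 3 / 1e6,     // assumption: $3 per million input tokens
  outRate: 15 / 1e6,   // assumption: $15 per million output tokens
  steps: 5,
  convergencePct: 0.10, // assumption: convergence row is 10% of a step row
  marketsPerDay: 10,
  daysPerMonth: 22,     // business-day calendar; use 30 for crypto
});

console.log("per loop $" + sonnetEnvelope.loopCost.toFixed(3));
console.log("per day  $" + sonnetEnvelope.dailyCost.toFixed(2));
console.log("per month $" + sonnetEnvelope.monthlyCost.toFixed(0));
```

Under these assumptions the results agree with the headline figures of $0.237 per loop, $2.37 per day, and $52 per month.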
How to use
Step-by-step
1. Enter your model selection, prompt length, output length, and expected call volume per task.
2. Set retry assumptions: max retries on error, max retries on timeout, and the cache hit rate for the prompt prefix.
3. Read the cost envelope: minimum (best case), median (typical), and 95th percentile (pessimistic). Budget against the 95th percentile if your workflow has heavy-tailed failure modes.
4. Toggle 'fallback chain enabled' to model what happens when the primary model fails. Fallback adds reliability at a multiplied cost.
5. Compare envelopes across model choices. The cheapest model isn't always the cheapest envelope: frequent retries on a weak model can exceed the cost of a stronger model used once.
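The point in the last step can be checked with the per-loop figures from the sensitivity table. This sketch assumes, as a simplification, that every retry reruns the whole loop at full price; the function names are hypothetical.

```javascript
// Per-loop costs in USD from the sensitivity table.
const weakModelLoop = 0.079;   // Claude Haiku 4.5
const strongModelLoop = 0.237; // Claude Sonnet 4.6

// Assumption: each attempt reruns the entire loop, so cost scales linearly.
function costForAttempts(loopCost, attempts) {
  return loopCost * attempts;
}

// Number of weak-model attempts at which its total matches one strong-model run.
function breakEvenAttempts(weakCost, strongCost) {
  return strongCost / weakCost;
}

console.log(breakEvenAttempts(weakModelLoop, strongModelLoop).toFixed(1));
console.log(costForAttempts(weakModelLoop, 4) > strongModelLoop);
```

With these two rows the break-even sits at about three attempts, so a weak model that typically needs a fourth attempt already costs more than running the stronger model once.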
For agents
Use in an agent
Same math, same result shape as the UI above — as a static ES module. No HTTP request, no auth, no rate limit.
import { compute } from "https://aifinhub.io/engines/agent-cost-envelope-calculator.js";
Contract: /contracts/agent-cost-envelope-calculator.json
Full agent guide →
Questions people ask next
FAQ
What's an agent cost envelope?
The full per-task cost band: minimum (cheapest model + cache hit + no retries) to maximum (best model + no cache + max retries). The envelope shows worst-case spend, not just the average. It matters because agent cost distributions have heavy right tails — one runaway loop can double the monthly bill.
How do retry assumptions affect cost?
Each retry repeats the input cost (prompt + context) and adds a fresh output cost. At 3 retries with a long context, total cost can be 4× the single-call cost, because the original call plus three full repeats makes four billed calls. The calculator surfaces both 'happy path' (zero retries) and 'pessimistic' (max retries) numbers, and the pessimistic number is the one to budget against.
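A minimal sketch of that retry arithmetic at the token level follows. The token counts and rates are illustrative assumptions, and the function names are hypothetical; it also assumes every retry is billed as a complete call with the same input and a similar-sized output.

```javascript
// Cost of one call: input (prompt + context) plus output, in USD.
function singleCallCost(inputTokens, outputTokens, inRate, outRate) {
  return inputTokens * inRate + outputTokens * outRate;
}

// Happy path is one call; pessimistic is the original call plus every retry.
function retryBand(callCost, maxRetries) {
  return {
    happyPath: callCost,
    pessimistic: callCost * (1 + maxRetries),
  };
}

// Illustrative assumptions: 8,000 input and 1,500 output tokens,
// $3 and $15 per million tokens.
const oneCall = singleCallCost(8000, 1500, 3 / 1e6, 15 / 1e6);
const band = retryBand(oneCall, 3);

console.log(band.happyPath.toFixed(4), band.pessimistic.toFixed(4));
console.log((band.pessimistic / band.happyPath).toFixed(1) + "x");
```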
Should I budget for the median or the 95th percentile?
If the envelope is volatile (heavy tail of failure modes), budget against the 95th percentile. If retries are rare and cheap, the median is fine. The calculator shows the gap so you can see how tight the distribution is. A 4× ratio between the median and the 95th percentile means your pessimistic case costs 4× a typical run, so plan accordingly.
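To see how the median-to-95th gap behaves, the sketch below computes both from a sample of per-loop costs using the nearest-rank method. The sample is entirely hypothetical (20 loops at multiples of the $0.237 base cost), and nearest-rank is one common percentile convention, not necessarily the one the calculator uses.

```javascript
// Nearest-rank percentile on an ascending-sorted array.
function percentile(sortedValues, p) {
  const rank = Math.ceil((p * sortedValues.length) / 100);
  return sortedValues[Math.max(rank, 1) - 1];
}

function summarize(costs) {
  const sorted = [...costs].sort((a, b) => a - b); // numeric sort, copy first
  const median = percentile(sorted, 50);
  const p95 = percentile(sorted, 95);
  return { median, p95, ratio: p95 / median };
}

// Hypothetical sample: most loops finish cleanly, a few retry heavily.
const sampleLoopCosts = [
  ...Array(14).fill(0.237), // no retries
  ...Array(4).fill(0.474),  // one full retry
  0.948,                    // three full retries
  1.185,                    // four full retries
];

const summary = summarize(sampleLoopCosts);
console.log(summary.median, summary.p95, summary.ratio.toFixed(1) + "x");
```

In this sample the median sits at the clean-run cost while the 95th percentile lands on the three-retry loop, which is the 4× gap described above.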
Does the tool include human-review cost?
No. It models LLM call cost only. Adding human-in-the-loop review can dominate cost for high-stakes agents — but the time/cost varies so much by team that the calculator stays focused on the LLM bill.
Why split prompt-caching savings into a separate field?
Prompt cache hit rate is a deployment characteristic, not an inherent property of the model. A high-hit-rate workload (1000 calls with the same system prompt) gets a 90% discount on the cached portion; a low-hit-rate workload gets no discount. Splitting the field forces you to think about the deployment shape.
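The effect of hit rate on input cost can be sketched as follows. The 90% discount on cached tokens comes from the answer above; the split between a cacheable prefix and the rest of the prompt, the token counts, the rate, and all names are illustrative assumptions. The sketch also ignores any surcharge a provider may apply when writing to the cache.

```javascript
// Expected input cost per call when a prompt prefix is cached some of the time.
function inputCostWithCache({
  inputTokens,         // total input tokens per call
  cachedPrefixTokens,  // tokens in the shared, cacheable prefix
  inRate,              // USD per input token at full price
  hitRate,             // fraction of calls that hit the cache, 0..1
  cacheDiscount = 0.9, // discount applied to cached tokens on a hit
}) {
  const uncachedTokens = inputTokens - cachedPrefixTokens;
  // On a hit the prefix is billed at (1 - discount); on a miss at full price.
  const prefixCost = cachedPrefixTokens * inRate * (1 - hitRate * cacheDiscount);
  return uncachedTokens * inRate + prefixCost;
}

// Same prompt shape, two deployment shapes (illustrative numbers).
const promptShape = { inputTokens: 8000, cachedPrefixTokens: 6000, inRate: 3 / 1e6 };
const highHitCost = inputCostWithCache({ ...promptShape, hitRate: 0.9 });
const noHitCost = inputCostWithCache({ ...promptShape, hitRate: 0 });

console.log(highHitCost.toFixed(5), noHitCost.toFixed(5));
```

The model and prompt are identical in both calls; only the hit rate differs, which is why the calculator treats it as a separate deployment input.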
Related deep dive
Read further
All articles →
Long-form context behind the tool output.
- Tutorial · Runnable · 11 min
  Observability Patterns for LLM Trading Agents
  Three patterns that stop silent failure: trace-ID propagation, structured log schema with per-step cost and confidence, and a deterministic replay harness.
- Methodology · Opinion · 10 min
  Bounded-Cost Agentic Research
  Three gates stop runaway agent loops: hard token budget, step-count cap, and a cost-convergence check that halts when belief stops moving.
- Tutorial · Runnable · 11 min
  Agent Memory Patterns for Finance Research
  Three memory tiers for finance agents (working, episodic, long-term lesson library) with retention policies and runnable Python for each.
Complementary tools
Users of this tool often explore
Token-Cost Optimizer
Compute the dollar cost of a trading research loop across Claude, GPT, and Gemini. Prompt length × model × retry × call volume → cost per idea and per.
Batch vs Real-Time Cost Calculator
Jobs per day, tokens per job, model, deadline — get real-time vs batch cost side-by-side with savings estimate and batch-eligibility flag. Based.
Fallback Chain Simulator
Define a provider fallback chain, simulate rate-limit and latency failures, and see p50/p95/p99 latency, success rate, total cost, and degradation-event.