aifinhub

Agent Cost Envelope Calculator

Estimates the cost envelope of an LLM research loop. Inputs: model, tokens per step, tool-use steps, convergence check, and markets per day. Outputs: per-loop, daily, and monthly cost.

Transparent by design: computed in your browser from a published formula and sourced rates, not a black box. Data verified May 25, 2026. Sources: Anthropic pricing · OpenAI pricing · Google AI / Gemini pricing.

Inputs: Form inputs / CSV
Runtime: Instant
Privacy: Client-side · no upload
API key: Not required

Education · Not investment advice. BaFin/EU framework. Past performance does not indicate future results.

1 · Configure your agent loop

Convergence check: 10% of a step
Calendar mode: business days (22 per month)

Monthly cost: $52/mo

Claude Sonnet 4.6 · 10 markets/day · 22 days · inside $200 target

Per loop: $0.237  ·  Per day: $2.37  ·  Utilisation: 26%

Cap recommendation

Per-loop budget: $0.909 ($200 / month ÷ 220 loops)
Max tokens per loop: 185.7K (current: 48.5K)
Suggested cap per step: 36.4K

You have headroom

2 · Per-loop breakdown

Step | Input cost | Output cost | Total
Step 1: tool call + reasoning | $0.024 | $0.022 | $0.046
Step 2: tool call + reasoning | $0.024 | $0.022 | $0.046
Step 3: tool call + reasoning | $0.024 | $0.022 | $0.046
Step 4: tool call + reasoning | $0.024 | $0.022 | $0.046
Step 5: tool call + reasoning | $0.024 | $0.022 | $0.046
Convergence check: final analysis | $0.00240 | $0.00225 | $0.00465
Per-loop total | | | $0.237

3 · Sensitivity — swap model tier

Same inputs, every model. Shows what the envelope becomes if you move tier, and which of the 10 models listed fit your target budget.

Model (provider) | Tier | Per loop | Per day | Per month | vs target
Gemini 2.5 Flash-Lite (google) | economy | $0.00714 | $0.071 | $2 | inside
Gemini 2.5 Flash (google) | economy | $0.031 | $0.314 | $7 | inside
GPT-5.4 mini (openai) | mid | $0.065 | $0.650 | $14 | inside
Claude Haiku 4.5 (anthropic) | economy | $0.079 | $0.790 | $17 | inside
Gemini 2.5 Pro (google) | frontier | $0.128 | $1.27 | $28 | inside
Gemini 3.5 Flash (google) | frontier | $0.130 | $1.30 | $29 | inside
o4-mini (reasoning) (openai) | mid | $0.214 | $2.14 | $47 | inside
Claude Sonnet 4.6 (anthropic, primary) | mid | $0.237 | $2.37 | $52 | inside
Claude Opus 4.8 (anthropic) | frontier | $0.395 | $3.95 | $87 | inside
GPT-5.5 (openai) | frontier | $0.433 | $4.33 | $95 | inside

How the envelope is priced

step_cost          = input_tokens × in_rate + output_tokens × out_rate
convergence_cost   = step_cost × convergence_pct
loop_cost          = steps × step_cost + convergence_cost
daily_cost         = loop_cost × markets_per_day
monthly_cost       = daily_cost × (22 biz | 30 crypto)
budget_per_loop    = target_monthly / (markets_per_day × days_per_month)
max_tokens_per_loop = budget_per_loop / blended_$_per_token

Pricing verified 2026-04-23. See methodology for the full rate table, calendar-mode rationale, and limitations.
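
The formulas above can be sketched in a few lines of JavaScript. This is an illustrative reimplementation, not the page's own engine. The token counts (8,000 input and 1,500 output per step) and the rates ($3 and $15 per million tokens) are assumptions chosen because they reproduce the Claude Sonnet 4.6 figures shown on this page. The definition of the blended rate and the per-step cap formula are also assumptions inferred from the displayed numbers.

```javascript
// Hypothetical inputs, chosen so the output matches the Claude Sonnet 4.6 example above.
const example = {
  inputTokens: 8000,     // assumed input tokens per step
  outputTokens: 1500,    // assumed output tokens per step
  inRate: 3 / 1e6,       // assumed $ per input token
  outRate: 15 / 1e6,     // assumed $ per output token
  steps: 5,
  convergencePct: 0.1,   // convergence check priced at 10% of a step
  marketsPerDay: 10,
  daysPerMonth: 22,      // 22 business days; use 30 for crypto
  targetMonthly: 200,
};

function costEnvelope(p) {
  const stepCost = p.inputTokens * p.inRate + p.outputTokens * p.outRate;
  const convergenceCost = stepCost * p.convergencePct;
  const loopCost = p.steps * stepCost + convergenceCost;
  const dailyCost = loopCost * p.marketsPerDay;
  const monthlyCost = dailyCost * p.daysPerMonth;

  const budgetPerLoop = p.targetMonthly / (p.marketsPerDay * p.daysPerMonth);
  // Assumption: the blended rate is one step's cost spread over its tokens.
  const blendedPerToken = stepCost / (p.inputTokens + p.outputTokens);
  const maxTokensPerLoop = budgetPerLoop / blendedPerToken;
  // Assumption: the per-step cap divides the loop budget by steps plus the convergence share.
  const capPerStep = maxTokensPerLoop / (p.steps + p.convergencePct);

  return {
    stepCost, loopCost, dailyCost, monthlyCost, budgetPerLoop,
    maxTokensPerLoop, capPerStep,
    utilisation: monthlyCost / p.targetMonthly,
  };
}

const envelope = costEnvelope(example);
console.log(envelope.loopCost.toFixed(3));           // 0.237
console.log(envelope.monthlyCost.toFixed(2));        // 52.17
console.log(Math.round(envelope.maxTokensPerLoop));  // 185728
```

Under these assumptions the sketch lands on the same per-loop cost ($0.237), token ceiling (185.7K), and suggested per-step cap (36.4K) as the readout above, which suggests the inferred definitions are at least consistent with the displayed figures.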

How to use

Step-by-step

  1. Enter your model selection, prompt length, output length, and expected call volume per task.

  2. Set retry assumptions: max retries on error, max retries on timeout, and the cache hit rate for the prompt prefix.

  3. Read the cost envelope: minimum (best case), median (typical), and 95th percentile (pessimistic). Budget against the 95th percentile if your workflow has heavy-tailed failure modes.

  4. Toggle 'fallback chain enabled' to model what happens when the primary model fails. Fallback adds reliability at a multiplied cost.

  5. Compare envelopes across model choices. The cheapest model isn't always the cheapest envelope: frequent retries on a weak model can exceed the cost of a stronger model used once.
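
The comparison in the last step can be made concrete with a small expected-cost sketch. It assumes each attempt succeeds independently with a fixed probability and that every retry is billed as a full loop. The per-loop costs come from the sensitivity table; the success rates (0.2 and 0.95) and the retry cap of 5 are hypothetical values picked only to illustrate the effect.

```javascript
// Expected number of attempts when each attempt succeeds independently with
// probability pSuccess and at most maxRetries retries follow the first call.
function expectedAttempts(pSuccess, maxRetries) {
  let attempts = 0;
  let pReached = 1; // probability that this attempt is made at all
  for (let k = 0; k <= maxRetries; k++) {
    attempts += pReached;
    pReached *= 1 - pSuccess;
  }
  return attempts;
}

function expectedLoopCost(costPerAttempt, pSuccess, maxRetries) {
  return costPerAttempt * expectedAttempts(pSuccess, maxRetries);
}

// Per-loop costs from the sensitivity table; success rates are hypothetical.
const weak = expectedLoopCost(0.079, 0.2, 5);    // Claude Haiku 4.5
const strong = expectedLoopCost(0.237, 0.95, 5); // Claude Sonnet 4.6
console.log(weak.toFixed(3), strong.toFixed(3), weak > strong); // 0.291 0.249 true
```

With these made-up success rates the cheaper model ends up with the more expensive envelope. With a higher success rate for the weak model the ordering flips back, so the result depends entirely on the rates you measure in your own workload.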

For agents

Use in an agent

Same math, same result shape as the UI above — as a static ES module. No HTTP request, no auth, no rate limit.

import { compute } from "https://aifinhub.io/engines/agent-cost-envelope-calculator.js";

Contract: /contracts/agent-cost-envelope-calculator.json

Questions people ask next

FAQ

What's an agent cost envelope?

The full per-task cost band: minimum (cheapest model + cache hit + no retries) to maximum (best model + no cache + max retries). The envelope shows worst-case spend, not just the average. It matters because agent cost distributions have heavy right tails — one runaway loop can double the monthly bill.

How do retry assumptions affect cost?

Each retry repeats the input cost (prompt + context) and adds a fresh output cost. With 3 retries there are 4 billed calls, so with a long context the total can reach 4× the single-call cost. The calculator surfaces both 'happy path' (zero retries) and 'pessimistic' (max retries) numbers; budget against the latter.
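
A minimal sketch of that arithmetic, assuming every retry is billed as a complete extra call. The token counts and per-million rates are hypothetical example values, not figures taken from the calculator's rate table.

```javascript
// Cost of one call: input and output tokens are priced separately.
function callCost(inputTokens, outputTokens, inRate, outRate) {
  return inputTokens * inRate + outputTokens * outRate;
}

// Each retry is billed as a full extra call, so billed calls = 1 + retries.
function costWithRetries(singleCallCost, retries) {
  return singleCallCost * (1 + retries);
}

// Hypothetical call: 8,000 input tokens, 1,500 output tokens, $3 / $15 per million tokens.
const single = callCost(8000, 1500, 3 / 1e6, 15 / 1e6);
const happyPath = costWithRetries(single, 0);
const pessimistic = costWithRetries(single, 3);
console.log(happyPath.toFixed(4), pessimistic.toFixed(4)); // 0.0465 0.1860
```

The pessimistic figure is four times the happy-path figure because the long input context is paid for again on every attempt.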

Should I budget for the median or the 95th percentile?

If the envelope is volatile (a heavy tail of failure modes), budget against the 95th percentile. If retries are rare and cheap, the median is fine. The calculator shows the gap so you can see how tight the distribution is. A 4× ratio between the median and the 95th percentile means roughly one run in twenty costs at least 4× a typical run, so plan accordingly.
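
To see how the median and 95th percentile relate on your own logs, a nearest-rank percentile is enough. The sketch below uses 20 hypothetical per-loop costs (a base of $0.237 with a few runs repeated one, two, or three extra times); the sample data and the choice of the nearest-rank method are assumptions for illustration.

```javascript
// Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it.
function percentile(samples, pct) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((pct * sorted.length) / 100));
  return sorted[rank - 1];
}

// Hypothetical per-loop costs for 20 runs: most finish on the first attempt,
// a few need one, two, or three full repeats.
const runs = [
  ...Array(14).fill(0.237),
  ...Array(2).fill(0.474),
  ...Array(2).fill(0.711),
  ...Array(2).fill(0.948),
];

const median = percentile(runs, 50);
const p95 = percentile(runs, 95);
console.log(median, p95, (p95 / median).toFixed(1)); // 0.237 0.948 4.0
```

Here the median hides the tail completely, while the 95th percentile is 4× higher, which is the situation where budgeting against the 95th percentile is the safer choice.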

Does the tool include human-review cost?

No. It models LLM call cost only. Adding human-in-the-loop review can dominate cost for high-stakes agents — but the time/cost varies so much by team that the calculator stays focused on the LLM bill.

Why split prompt-caching savings into a separate field?

Prompt cache hit rate is a deployment characteristic, not an inherent property of the model. A high-hit-rate workload (1000 calls with the same system prompt) gets a 90% discount on the cached portion; a low-hit-rate workload gets no discount. Splitting the field forces you to think about the deployment shape.
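
One way to see why the hit rate belongs in its own field is to price the input side with and without cache hits. The sketch below is a simplified model under stated assumptions: only a fixed prefix of the prompt is cacheable, a hit bills that prefix at a 90% discount, a miss bills it in full, and cache-write surcharges are ignored. The function name, parameter names, and token counts are hypothetical.

```javascript
// Expected input cost for one call when part of the prompt can be served from cache.
function inputCostWithCache({ inputTokens, cacheablePrefixTokens, hitRate, inRate, cacheDiscount = 0.9 }) {
  const uncachedTokens = inputTokens - cacheablePrefixTokens;
  // On a hit the prefix is billed at (1 - discount); on a miss it is billed in full.
  const prefixMultiplier = hitRate * (1 - cacheDiscount) + (1 - hitRate);
  return (uncachedTokens + cacheablePrefixTokens * prefixMultiplier) * inRate;
}

// Hypothetical call: 8,000 input tokens, 6,000 of them a shared system prompt, $3 per million.
const base = { inputTokens: 8000, cacheablePrefixTokens: 6000, inRate: 3 / 1e6 };
const noHits = inputCostWithCache({ ...base, hitRate: 0 });
const allHits = inputCostWithCache({ ...base, hitRate: 1 });
console.log(noHits.toFixed(4), allHits.toFixed(4)); // 0.0240 0.0078
```

The same model and the same prompt differ by roughly 3× on input cost between the two deployments, purely because of how often the prefix is reused.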

Planning estimates only — not financial, tax, or investment advice.