Methodology · Tool · Last updated 2026-04-23
How Agent Cost Envelope Calculator works
How the Agent Cost Envelope Calculator bounds the dollar cost of an LLM research loop.
What the tool computes
Agentic research loops feel cheap per call and expensive per month. The trap is simple: every market you analyse is a loop, every loop is N tool-use steps, and every step is a full-context round trip through a frontier model. The Agent Cost Envelope Calculator gives you the monthly ceiling before you ship an agent into production, and back-solves the max-tokens-per-loop cap that keeps you inside a target budget.
This pattern — budgeting the envelope first, then capping token usage to match — is the bounded-cost agentic research playbook. Build it before you press go, not after your first $2,400 bill.
Inputs and assumptions
- Primary model — one of eight (Claude Opus 4.7, Sonnet 4.6, Haiku 4.5, GPT-5, GPT-5 mini, o4-mini, Gemini 2.5 Pro, Gemini 2.5 Flash). Prices sourced from each vendor's public pricing page.
- Input tokens per step — the full prompt that hits the model each step, including tool-call arguments, retrieved context, and system prompts.
- Output tokens per step — model-generated text, including tool-call invocations.
- Steps per loop — how many tool-use iterations the agent does before it decides to stop. In practice this is bounded by a
max_stepsguard. - Convergence-check cost — a final self-check / synthesis step priced as a percentage of a normal step. Default
10%assumes the convergence step uses the same context but a short summary-style output. - Markets analyzed per day — distinct ideas, tickers, or signals the agent is run against per day.
- Target monthly budget — the ceiling you want to stay under. Drives the cap recommendation.
- Calendar mode —
22 business daysfor equities / futures research,30 calendar daysfor crypto or always-on workloads. Matters more than most people think (see below).
Formulas
step_cost = input_tokens × in_rate + output_tokens × out_rate
convergence_cost = step_cost × (convergence_pct / 100)
loop_cost = steps × step_cost + convergence_cost
daily_cost = loop_cost × markets_per_day
monthly_cost = daily_cost × days_per_month
budget_per_loop = target_monthly / (markets_per_day × days_per_month)
max_tokens_per_loop = budget_per_loop / blended_$_per_token
max_tokens_per_step = max_tokens_per_loop / (steps + convergence_pct/100)
The blended $/token uses the user's current input-to-output ratio. Output-heavy agents (big synthesised reports) get a tighter token cap than input-heavy agents (lots of context, short answer).
Pricing rate table (2026-04-23, USD per 1M tokens)
| Model | Input | Output | Cache read |
|---|---|---|---|
| Claude Opus 4.7 | $15 | $75 | $1.50 |
| Claude Sonnet 4.6 | $3 | $15 | $0.30 |
| Claude Haiku 4.5 | $1 | $5 | $0.10 |
| GPT-5 | $10 | $40 | — |
| GPT-5 mini | $2 | $8 | — |
| o4-mini | $3 | $12 | — |
| Gemini 2.5 Pro | $1.25 | $10 | — |
| Gemini 2.5 Flash | $0.30 | $2.50 | — |
How the convergence check enters the math
Most agent scaffolds do a final "check your work" step — one more model call that takes the accumulated context and either signs off or requests another loop. That step reads a large input but emits a small output, so pricing it as a full step overstates the envelope.
The calculator prices the convergence check as a fraction of a normal step (default 10%). Tune it higher for agents that re-summarise everything with long outputs, lower for agents that just emit a boolean { "done": true }.
Business-day vs calendar-day default
Equities and futures research agents typically run only on trading days — roughly 22 business days / month. Crypto agents, on-chain monitors, and 24/7 inference pipelines run on all 30 calendar days. The difference is large: the same daily spend costs ~36% more on the crypto clock.
For this reason the tool defaults to business (equities framing) but exposes a one-click switch to crypto. Pick the one that matches your deployment cadence, not the one that looks cheaper.
Pricing sources
- Anthropic API pricing (docs.anthropic.com/en/docs/about-claude/pricing)
- OpenAI API pricing
- Google AI / Gemini pricing
Limitations
- Planning tool, not investment advice. The calculator computes a budget envelope from user inputs. It does not recommend a model, a strategy, or a trade.
- Direct-API pricing only. Batch-API discounts (Anthropic 50%, OpenAI 50%), enterprise rates, and cache-read savings are not modelled here — see the Token-Cost Optimizer for a cache-aware view.
- Deterministic tokens. Real agent steps have token variance; use the mean of a representative run.
- No multimodal pricing. Image inputs, vision tokens, and audio are not modelled.
- Tool-use side costs. Upstream tool APIs (data vendor, brokerage, retrieval) have their own costs — see inference cost attribution per trade.
- Published rates as of 2026-04-23. Vendors change pricing often. Re-verify before locking a production budget.
Related articles
- Bounded-cost agentic research — the pattern this tool operationalises.
- Inference cost attribution per trade — attributing the envelope downstream of the agent.
- Token cost reality in LLM trading research — why back-of-the-envelope numbers are usually wrong by 3-10×.
Changelog
- 2026-04-23 — Initial release with 8 models, convergence-check sizing, and business/crypto calendar toggle.