Methodology: Agent Cost Envelope Calculator

What the tool computes

Agentic research loops feel cheap per call and expensive per month. The trap is simple: every market you analyse is a loop, every loop is N tool-use steps, and every step is a full-context round trip through a frontier model. The Agent Cost Envelope Calculator gives you the monthly ceiling before you ship an agent into production, and back-solves the max-tokens-per-loop cap that keeps you inside a target budget.

This pattern — budgeting the envelope first, then capping token usage to match — is the bounded-cost agentic research playbook. Build it before you press go, not after your first $2,400 bill.

Inputs and assumptions

Primary model — one of eight (Claude Opus 4.7, Sonnet 4.6, Haiku 4.5, GPT-5, GPT-5 mini, o4-mini, Gemini 2.5 Pro, Gemini 2.5 Flash). Prices sourced from each vendor's public pricing page.
Input tokens per step — the full prompt that hits the model each step, including tool-call arguments, retrieved context, and system prompts.
Output tokens per step — model-generated text, including tool-call invocations.
Steps per loop — how many tool-use iterations the agent runs before it stops. In practice a max_steps guard bounds this.
Convergence-check cost — a final self-check / synthesis step priced as a percentage of a normal step. Default 10% assumes the convergence step uses the same context but a short summary-style output.
Markets analyzed per day — distinct ideas, tickers, or signals the agent is run against per day.
Target monthly budget — the ceiling you want to stay under. Drives the cap recommendation.
Calendar mode — 22 business days for equities / futures research, 30 calendar days for crypto or always-on workloads. Matters more than most people think (see below).

Formulas

step_cost           = input_tokens × in_rate + output_tokens × out_rate
convergence_cost    = step_cost × (convergence_pct / 100)
loop_cost           = steps × step_cost + convergence_cost
daily_cost          = loop_cost × markets_per_day
monthly_cost        = daily_cost × days_per_month
budget_per_loop     = target_monthly / (markets_per_day × days_per_month)
max_tokens_per_loop = budget_per_loop / blended_$_per_token
max_tokens_per_step = max_tokens_per_loop / (steps + convergence_pct/100)

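The blended $/token is weighted by the user's current input-to-output ratio, so output-heavy agents (big synthesised reports) get a tighter token cap than input-heavy agents (lots of context, short answer). Note that in_rate and out_rate are per-token rates: the per-1M-token prices in the rate table divided by 1,000,000.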
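The formulas above can be sketched in Python as follows. This is a minimal illustration, not the calculator's actual source: the EnvelopeInputs and envelope names are hypothetical, and the blended rate is assumed to be step_cost divided by total tokens per step, which is one reading of "uses the user's current input-to-output ratio".

```python
from dataclasses import dataclass


@dataclass
class EnvelopeInputs:
    input_tokens: float     # input tokens per step
    output_tokens: float    # output tokens per step
    in_rate: float          # USD per input token (table price / 1_000_000)
    out_rate: float         # USD per output token (table price / 1_000_000)
    steps: int              # tool-use steps per loop
    convergence_pct: float  # convergence check as % of a normal step
    markets_per_day: float
    days_per_month: int     # 22 (business) or 30 (calendar)
    target_monthly: float   # USD


def envelope(x: EnvelopeInputs) -> dict:
    step_cost = x.input_tokens * x.in_rate + x.output_tokens * x.out_rate
    convergence_cost = step_cost * (x.convergence_pct / 100)
    loop_cost = x.steps * step_cost + convergence_cost
    daily_cost = loop_cost * x.markets_per_day
    monthly_cost = daily_cost * x.days_per_month

    budget_per_loop = x.target_monthly / (x.markets_per_day * x.days_per_month)
    # Assumption: blended $/token = cost of one step / tokens in one step.
    blended = step_cost / (x.input_tokens + x.output_tokens)
    max_tokens_per_loop = budget_per_loop / blended
    max_tokens_per_step = max_tokens_per_loop / (x.steps + x.convergence_pct / 100)

    return {
        "step_cost": step_cost,
        "loop_cost": loop_cost,
        "daily_cost": daily_cost,
        "monthly_cost": monthly_cost,
        "budget_per_loop": budget_per_loop,
        "max_tokens_per_loop": max_tokens_per_loop,
        "max_tokens_per_step": max_tokens_per_step,
    }


# Illustrative inputs: Sonnet 4.6 rates, 20k in / 1k out per step, 8 steps,
# 10% convergence check, 50 markets/day, 22 business days, $500 target.
example = EnvelopeInputs(20_000, 1_000, 3 / 1e6, 15 / 1e6, 8, 10, 50, 22, 500)
result = envelope(example)
print(f"monthly cost: ${result['monthly_cost']:.2f}")
print(f"max tokens per step: {result['max_tokens_per_step']:.0f}")
```

With these illustrative inputs the monthly cost comes to $668.25 against a $500 target, so the back-solved cap (roughly 15,700 blended tokens per step) is lower than the 21,000 tokens per step currently in use.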

Pricing rate table (2026-04-23, USD per 1M tokens)

Model	Input	Output	Cache read
Claude Opus 4.7	$15	$75	$1.50
Claude Sonnet 4.6	$3	$15	$0.30
Claude Haiku 4.5	$1	$5	$0.10
GPT-5	$10	$40	—
GPT-5 mini	$2	$8	—
o4-mini	$3	$12	—
Gemini 2.5 Pro	$1.25	$10	—
Gemini 2.5 Flash	$0.30	$2.50	—
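As a small sketch, the input and output columns of this table can be held as a lookup keyed by model name, with a helper that converts the published per-1M-token prices into the per-token in_rate and out_rate used by the formulas. The names RATES_PER_MTOK and per_token_rates are illustrative and are not taken from the tool itself.

```python
# (input, output) prices in USD per 1M tokens, copied from the table above.
RATES_PER_MTOK = {
    "Claude Opus 4.7": (15.00, 75.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Haiku 4.5": (1.00, 5.00),
    "GPT-5": (10.00, 40.00),
    "GPT-5 mini": (2.00, 8.00),
    "o4-mini": (3.00, 12.00),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "Gemini 2.5 Flash": (0.30, 2.50),
}


def per_token_rates(model: str) -> tuple[float, float]:
    """Return (in_rate, out_rate) in USD per single token."""
    in_per_m, out_per_m = RATES_PER_MTOK[model]
    return in_per_m / 1_000_000, out_per_m / 1_000_000


in_rate, out_rate = per_token_rates("Claude Sonnet 4.6")
print(in_rate, out_rate)
```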

How the convergence check enters the math

Most agent scaffolds do a final "check your work" step — one more model call that takes the accumulated context and either signs off or requests another loop. That step reads a large input but emits a small output, so pricing it as a full step overstates the envelope.

The calculator prices the convergence check as a fraction of a normal step (default 10%). Tune it higher for agents that re-summarise everything with long outputs, lower for agents that just emit a boolean { "done": true }.
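To get a feel for how much this setting moves the envelope, the sketch below compares loop cost at several convergence percentages, using the loop_cost formula from the Formulas section. The function names and the $0.075 step cost are illustrative assumptions, and steps is assumed to be at least 1.

```python
def loop_cost(step_cost: float, steps: int, convergence_pct: float) -> float:
    """loop_cost = steps * step_cost + step_cost * (convergence_pct / 100)."""
    return steps * step_cost + step_cost * (convergence_pct / 100)


def convergence_share(steps: int, convergence_pct: float) -> float:
    """Fraction of the loop cost that comes from the convergence check."""
    fraction = convergence_pct / 100
    return fraction / (steps + fraction)


# Illustrative: 8 steps per loop at $0.075 per step.
for pct in (0, 10, 50, 100):
    cost = loop_cost(0.075, 8, pct)
    share = convergence_share(8, pct)
    print(f"{pct:>3}% -> loop ${cost:.4f}, convergence share {share:.1%}")
```

In an 8-step loop, the default 10% adds a little over 1% to the loop cost, while pricing the check as a full step (100%) makes it one ninth of the loop. The setting matters most for agents with few steps per loop.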

Business-day vs calendar-day default

Equities and futures research agents typically run only on trading days, roughly 22 business days per month. Crypto agents, on-chain monitors, and 24/7 inference pipelines run on all 30 calendar days. The difference is large: at the same daily spend, the 30-day clock costs about 36% more per month than the 22-day clock (30 / 22 ≈ 1.36).

For this reason the tool defaults to business (equities framing) but exposes a one-click switch to crypto. Pick the one that matches your deployment cadence, not the one that looks cheaper.
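The calendar toggle reduces to a choice of multiplier on daily cost. A minimal sketch follows; the mode names and the DAYS_PER_MONTH mapping are illustrative, not the tool's own identifiers.

```python
# Days per month for each calendar mode described above.
DAYS_PER_MONTH = {"business": 22, "crypto": 30}


def monthly_cost(daily_cost: float, calendar_mode: str = "business") -> float:
    """monthly_cost = daily_cost * days_per_month for the chosen mode."""
    return daily_cost * DAYS_PER_MONTH[calendar_mode]


daily = 30.375  # illustrative daily spend in USD
business = monthly_cost(daily, "business")
crypto = monthly_cost(daily, "crypto")
print(f"business: ${business:.2f}, crypto: ${crypto:.2f}")
print(f"crypto premium: {crypto / business - 1:.1%}")
```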

Pricing sources

Anthropic API pricing (docs.anthropic.com/en/docs/about-claude/pricing)
OpenAI API pricing
Google AI / Gemini pricing

Limitations

Planning tool, not investment advice. The calculator computes a budget envelope from user inputs. It does not recommend a model, a strategy, or a trade.
Direct-API pricing only. Batch-API discounts (Anthropic 50%, OpenAI 50%), enterprise rates, and cache-read savings are not modelled here — see the Token-Cost Optimizer for a cache-aware view.
Deterministic tokens. Real agent steps have token variance; use the mean of a representative run.
No multimodal pricing. Image inputs, vision tokens, and audio are not modelled.
Tool-use side costs. Upstream tool APIs (data vendor, brokerage, retrieval) have their own costs — see inference cost attribution per trade.
Published rates as of 2026-04-23. Vendors change pricing often. Re-verify before locking a production budget.
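For the deterministic-tokens limitation, one way to derive the per-step inputs is to average the token counts logged from a representative run. The sketch below assumes a hypothetical log format (a list of dicts with input_tokens and output_tokens keys), and the numbers are placeholders rather than measurements.

```python
from statistics import mean


def mean_tokens_per_step(step_log: list[dict]) -> tuple[float, float]:
    """Return (mean input tokens, mean output tokens) across logged steps."""
    inputs = [step["input_tokens"] for step in step_log]
    outputs = [step["output_tokens"] for step in step_log]
    return mean(inputs), mean(outputs)


# Placeholder log of three steps from one run (hypothetical structure).
run_log = [
    {"input_tokens": 18_000, "output_tokens": 900},
    {"input_tokens": 22_000, "output_tokens": 1_100},
    {"input_tokens": 20_000, "output_tokens": 1_000},
]
avg_in, avg_out = mean_tokens_per_step(run_log)
print(avg_in, avg_out)
```

The two means can then be entered as input tokens per step and output tokens per step.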

Related reading

Bounded-cost agentic research — the pattern this tool operationalises.
Inference cost attribution per trade — attributing the envelope downstream of the agent.
Token cost reality in LLM trading research — why back-of-the-envelope numbers are usually wrong by 3-10×.

Changelog

2026-04-23 — Initial release with 8 models, convergence-check sizing, and business/crypto calendar toggle.
