Agent-Cost Envelope
Agent-cost envelope = sum over (model_call_count × tokens_per_call × price_per_token) for a complete decision cycle. Includes prompt tokens, completion tokens, thinking-token charges (where applicable), and retry inflation when calls fail. The envelope is bounded by an explicit cap; without one, an agent in a degenerate loop can burn its monthly budget in hours.
On This Page
Definition
Agent-cost envelope
Agent-cost envelope = sum over (model_call_count × tokens_per_call × price_per_token) for a complete decision cycle. Includes prompt tokens, completion tokens, thinking-token charges (where applicable), and retry inflation when calls fail. The envelope is bounded by an explicit cap; without one, an agent in a degenerate loop can burn its monthly budget in hours.
Why it matters
LLM-driven trading agents are economically viable only when per-decision cost is below the per-decision edge. A research loop that costs $0.50 per query and produces a 5bps edge on a $100k position generates $5/decision — fine. The same loop on a $10k position generates $0.50 — break-even before slippage. Without an explicit cost envelope, agents run themselves into negative territory silently.
How it works
Map every model call in the agent loop. Estimate tokens per call (prompt + completion + thinking). Multiply by per-token price for each model used. Multiply by expected retry rate (typically 1.05-1.15x for production systems). Sum across the loop. Apply a hard cap at decision time — when reached, bail out with the partial answer rather than continuing to spend.
Example
Multi-step trading research agent, GPT-5 + Claude 4.6
Steps per decision
8
Avg tokens per step (in+out)
4,500
Blended price
$0.0045 / 1k tokens
Retry inflation
1.10x
Cost per decision
8 × 4.5 × 0.0045 × 1.10 = $0.18
$0.18 per decision is fine on positions over a few hundred dollars; ruinous on micro-positions. Set the envelope, log every decision, alert on breach.
Key Takeaways
Always set an explicit per-decision cost cap before deploying.
Log token usage per call — cost forensics after a runaway are otherwise impossible.
Retry budgets compound: a 10% retry rate over an 8-step loop is a 1.83x worst-case cost multiplier.
Related Terms
Try These Tools
Run the numbers next
Agent Cost Envelope Calculator
Model an LLM research loop end-to-end — steps, tool calls, convergence checks, markets per day — and see per-loop, daily, and monthly cost with cost-cap.
Token-Cost Optimizer
Compute the dollar cost of a trading research loop across Claude, GPT, and Gemini. Prompt length × model × retry × call volume → cost per idea and per.
Batch vs Real-Time Cost Calculator
Jobs per day, tokens per job, model, deadline — get real-time vs batch cost side-by-side with savings estimate and batch-eligibility flag. Based.
FAQ
Questions people ask next
The short answers readers usually want after the first pass.
Sources & References
- Pricing — Models — Anthropic
- Pricing — API — OpenAI
Related Content
Keep the topic connected
Hallucination Detection
Detecting LLM hallucinations in financial outputs: the verifiable-claim approach, citation grounding, and cross-model agreement signals that work.
Model Drift
Model drift: when an LLM's behavior changes between calls, versions, or weeks. The monitoring stack that catches it before production breaks.