Methodology · Tool · Last updated 2026-04-20
How Token-Cost Optimizer works
How the Token-Cost Optimizer prices LLM research loops.
Formulas
effective_call = input_tokens × price_in + output_tokens × price_out
(Anthropic models: the cache-hit fraction of input tokens is priced at cache_read instead of price_in)
calls_per_idea = calls × (1 + retry_rate)
cost_per_idea = effective_call × calls_per_idea
cost_per_day = cost_per_idea × ideas_per_day
cost_per_validated = cost_per_idea / validation_rate
cost_per_year = cost_per_day × 365
Pricing rate table (2026-05-25, USD per 1M tokens)
| Model | Input | Output | Cache read |
|---|---|---|---|
| Claude Opus 4.7 | $5 | $25 | $0.50 |
| Claude Sonnet 4.6 | $3 | $15 | $0.30 |
| Claude Haiku 4.5 | $1 | $5 | $0.10 |
| GPT-5.5 | $5 | $30 | — |
| GPT-5.4 mini | $0.75 | $4.50 | — |
| o4-mini | $3 | $12 | — |
| Gemini 2.5 Pro | $1.25 | $10 | — |
| Gemini 3.5 Flash | $1.50 | $9 | — |
| Gemini 2.5 Flash | $0.30 | $2.50 | — |
| Gemini 2.5 Flash-Lite | $0.10 | $0.40 | — |
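
The formulas and rates above can be sketched in Python as follows. This is a minimal illustration, not the tool's actual code: the names `Rate`, `RATES`, `effective_call`, and `cost_summary` are hypothetical, only two table rows are copied in, and it assumes the cache-hit fraction applies to input tokens only and that table prices are divided by 1,000,000 to get a per-token price.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rate:
    """List prices in USD per 1M tokens (hypothetical container)."""
    price_in: float
    price_out: float
    cache_read: Optional[float] = None  # None means no cache pricing is modeled


# Two rows copied from the rate table above; add others as needed.
RATES = {
    "Claude Sonnet 4.6": Rate(3.0, 15.0, 0.30),
    "GPT-5.4 mini": Rate(0.75, 4.50),
}


def effective_call(rate, input_tokens, output_tokens, cache_hit=0.0):
    """Cost of one call in USD.

    Assumption: cache_hit is the fraction of input tokens billed at
    cache_read; it is ignored for models without a cache_read price.
    """
    if rate.cache_read is None:
        blended_in = rate.price_in
    else:
        blended_in = (1 - cache_hit) * rate.price_in + cache_hit * rate.cache_read
    return (input_tokens * blended_in + output_tokens * rate.price_out) / 1_000_000


def cost_summary(rate, input_tokens, output_tokens, calls, retry_rate,
                 ideas_per_day, validation_rate, cache_hit=0.0):
    """Apply the formula chain from the Formulas section."""
    per_call = effective_call(rate, input_tokens, output_tokens, cache_hit)
    calls_per_idea = calls * (1 + retry_rate)
    cost_per_idea = per_call * calls_per_idea
    cost_per_day = cost_per_idea * ideas_per_day
    return {
        "effective_call": per_call,
        "calls_per_idea": calls_per_idea,
        "cost_per_idea": cost_per_idea,
        "cost_per_day": cost_per_day,
        "cost_per_validated": cost_per_idea / validation_rate,
        "cost_per_year": cost_per_day * 365,
    }


summary = cost_summary(
    RATES["Claude Sonnet 4.6"],
    input_tokens=10_000, output_tokens=2_000,
    calls=4, retry_rate=0.25,
    ideas_per_day=10, validation_rate=0.2,
    cache_hit=0.5,
)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

With these illustrative inputs, one call costs about $0.0465, an idea about $0.23, and a validated idea about $1.16. Because `validation_rate` sits in a denominator, a small change in it moves `cost_per_validated` more than a comparable change in any per-token price.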
Pricing sources
Assumptions + limitations
- Direct-API pricing only. Batch-API discounts (Anthropic 50%, OpenAI 50%) and enterprise rates are not modeled.
- Deterministic token counts. Real prompts have variance; use the average of a representative sample for input/output token counts.
- Cache hit rate applies only to Anthropic models with prompt caching. For other providers, that slider has no effect.
- No image or tool-use pricing. Multimodal inputs and tool-call round trips add tokens not counted in the calculator — add them to the input/output fields manually if material.
- Validation rate is a user estimate. Track it empirically for accuracy; the tool cannot infer it.
- Retry rate models transient failures + model re-prompts. Each retry is priced as a repeat of the original call, so structured retries that resend a larger context are undercounted.
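
Several of the inputs above (average token counts, retry rate, validation rate) are meant to be measured rather than guessed. One way to derive them from a call log is sketched below. The log format (dicts with `input_tokens`, `output_tokens`, `is_retry`) and the function name `estimate_inputs` are assumptions made for illustration; the tool itself does not read logs.

```python
from statistics import mean


def estimate_inputs(call_log, ideas_total, ideas_validated):
    """Derive calculator inputs from a sample of logged calls.

    call_log: list of dicts with 'input_tokens', 'output_tokens', 'is_retry'
    (a hypothetical format). Token averages are taken over ALL calls,
    retries included, so that heavier retries raise the average.
    """
    first_attempts = [c for c in call_log if not c["is_retry"]]
    retries = len(call_log) - len(first_attempts)
    return {
        "input_tokens": mean(c["input_tokens"] for c in call_log),
        "output_tokens": mean(c["output_tokens"] for c in call_log),
        "calls": len(first_attempts) / ideas_total,
        "retry_rate": retries / len(first_attempts),
        "validation_rate": ideas_validated / ideas_total,
    }


# Tiny illustrative sample: 4 first attempts and 1 retry across 2 ideas.
sample_log = [
    {"input_tokens": 1000, "output_tokens": 200, "is_retry": False},
    {"input_tokens": 1000, "output_tokens": 200, "is_retry": False},
    {"input_tokens": 1000, "output_tokens": 200, "is_retry": False},
    {"input_tokens": 1000, "output_tokens": 200, "is_retry": False},
    {"input_tokens": 2000, "output_tokens": 200, "is_retry": True},
]
inputs = estimate_inputs(sample_log, ideas_total=2, ideas_validated=1)
print(inputs)
```

Averaging over every logged call, retries included, means that average tokens × calls × (1 + retry_rate) reproduces the total tokens per idea in the sample. That partly offsets the undercounting of retries that resend a larger context, at the cost of a less intuitive "average call" figure.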
Changelog
- 2026-04-20 — Initial release with 8 models across Anthropic, OpenAI, Google.
- 2026-05-25 — Refreshed rates to current list prices: Claude Opus 4.7 to $5/$25 (cache read $0.50), GPT-5.5 to $5/$30, GPT-5.4 mini to $0.75/$4.50.
- 2026-05-25 — Added Gemini 3.5 Flash ($1.50/$9, a frontier agent-tier) and Gemini 2.5 Flash-Lite ($0.10/$0.40, the cheapest tier); table now tracks 10 models.