Methodology · Tool · Last updated 2026-04-20

How Token-Cost Optimizer works

How the Token-Cost Optimizer prices LLM research loops.

Formulas

effective_call      = input_tokens × price_in + output_tokens × price_out
                      (Anthropic: the cache-hit fraction of input tokens is priced at cache_read instead of price_in)
calls_per_idea      = calls × (1 + retry_rate)
cost_per_idea       = effective_call × calls_per_idea
cost_per_day        = cost_per_idea × ideas_per_day
cost_per_validated  = cost_per_idea / validation_rate
cost_per_year       = cost_per_day × 365
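
The formulas above can be sketched in Python. The cache-hit handling is an interpretation of the Anthropic note (the cache-hit fraction of input tokens billed at the cache_read rate); all prices are USD per 1M tokens, and the sample numbers are hypothetical inputs, not tool defaults:

```python
def effective_call(input_tokens, output_tokens, price_in, price_out,
                   cache_hit=0.0, price_cache=0.0):
    """Cost of one API call in USD. Prices are USD per 1M tokens; the
    cache-hit fraction of input tokens is billed at price_cache instead."""
    uncached_in = input_tokens * (1 - cache_hit) * price_in
    cached_in = input_tokens * cache_hit * price_cache
    return (uncached_in + cached_in + output_tokens * price_out) / 1_000_000

def cost_per_idea(call_cost, calls, retry_rate):
    # calls_per_idea = calls × (1 + retry_rate)
    return call_cost * calls * (1 + retry_rate)

# Hypothetical run: Claude Sonnet 4.6, 10k input / 2k output tokens, no caching
call = effective_call(10_000, 2_000, price_in=3, price_out=15)  # ≈ $0.06
idea = cost_per_idea(call, calls=20, retry_rate=0.10)           # ≈ $1.32
day = idea * 5            # ideas_per_day = 5        → ≈ $6.60
validated = idea / 0.25   # validation_rate = 0.25   → ≈ $5.28
year = day * 365          #                          → ≈ $2409
```
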

Pricing rate table (2026-04-20, USD per 1M tokens)

Model                Input    Output    Cache read
Claude Opus 4.7      $15      $75       $1.50
Claude Sonnet 4.6    $3       $15       $0.30
Claude Haiku 4.5     $1       $5        $0.10
GPT-5                $10      $40       n/a
GPT-5 mini           $2       $8        n/a
o4-mini              $3       $12       n/a
Gemini 2.5 Pro       $1.25    $10       n/a
Gemini 2.5 Flash     $0.30    $2.50     n/a
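
For scripting against the table, the rates can be expressed as a lookup structure. The dictionary keys and field names below are illustrative, not the tool's internal schema; None marks models without an Anthropic-style cache-read tier:

```python
# USD per 1M tokens, as of 2026-04-20; None = no cache-read rate listed
RATES = {
    "Claude Opus 4.7":   {"in": 15.00, "out": 75.00, "cache_read": 1.50},
    "Claude Sonnet 4.6": {"in": 3.00,  "out": 15.00, "cache_read": 0.30},
    "Claude Haiku 4.5":  {"in": 1.00,  "out": 5.00,  "cache_read": 0.10},
    "GPT-5":             {"in": 10.00, "out": 40.00, "cache_read": None},
    "GPT-5 mini":        {"in": 2.00,  "out": 8.00,  "cache_read": None},
    "o4-mini":           {"in": 3.00,  "out": 12.00, "cache_read": None},
    "Gemini 2.5 Pro":    {"in": 1.25,  "out": 10.00, "cache_read": None},
    "Gemini 2.5 Flash":  {"in": 0.30,  "out": 2.50,  "cache_read": None},
}

# Example lookup: per-call cost for GPT-5 mini, 8k input / 1k output tokens
r = RATES["GPT-5 mini"]
cost = (8_000 * r["in"] + 1_000 * r["out"]) / 1_000_000  # ≈ $0.024
```
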

Assumptions + limitations

  1. Direct-API pricing only. Batch-API discounts (Anthropic 50%, OpenAI 50%) and enterprise rates are not modeled.
  2. Deterministic token counts. Real prompts have variance; use the average of a representative sample for input/output token counts.
  3. Cache hit rate applies only to Anthropic models with prompt caching. For other providers, that slider has no effect.
  4. No image or tool-use pricing. Multimodal inputs and tool-call round trips add tokens not counted in the calculator — add them to the input/output fields manually if material.
  5. Validation rate is a user estimate. Track it empirically for accuracy; the tool cannot infer it.
  6. Retry rate models transient failures and model re-prompts. Retries that resend a larger context (for example, the original prompt plus an appended error message) are undercounted.
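
Two of these limitations are easy to correct by hand on the tool's output. A sketch under stated assumptions: the 50% discount figure is the one quoted in assumption 1, and tokens_per_trip is a placeholder you would measure yourself, not a published number:

```python
def apply_batch_discount(cost_usd, discount=0.50):
    # Assumption 1: Batch-API discounts (Anthropic/OpenAI, ~50%) are not
    # modeled; apply them to the calculator's reported cost manually.
    return cost_usd * (1 - discount)

def add_tool_use_tokens(input_tokens, tool_round_trips, tokens_per_trip=1_500):
    # Assumption 4: tool-call round trips add input tokens the calculator
    # doesn't count. tokens_per_trip is a placeholder; measure your own.
    return input_tokens + tool_round_trips * tokens_per_trip

adjusted = apply_batch_discount(6.60)    # ≈ $3.30/day under batch pricing
inputs = add_tool_use_tokens(10_000, 4)  # → 16_000 tokens to enter as input
```
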

Changelog

  • 2026-04-20 — Initial release with 8 models across Anthropic, OpenAI, Google.

Planning estimates only — not financial, tax, or investment advice.