Skip to main content
aifinhub

Methodology · Tool · Last updated 2026-05-08

How Earnings-Call Summarization Cost works

Pricing snapshot, cost formula, and caveats for the Earnings-Call Summarization Cost Calculator.

Cost formula

fresh_input_tok  = transcript_tok · (1 − cache_hit_rate)
cached_input_tok = transcript_tok · cache_hit_rate

input_cost  = fresh_input_tok  · price_input  / 1e6
            + cached_input_tok · price_cache  / 1e6
output_cost = summary_tok · price_output / 1e6

cost_per_attempt          = input_cost + output_cost
cost_per_stock_per_quarter = cost_per_attempt · attempts
cost_per_stock_per_year    = cost_per_stock_per_quarter · 4
total_universe_per_year    = cost_per_stock_per_year · tickers

Pricing snapshot — as of 2026-04-25

Prices in USD per million tokens. Listed input/output (no batch discount). Cache read price is the discounted rate vendors charge after a cache write has been paid at full input rate.

  • Anthropic Claude Sonnet 4.5: input $3.00, output $15.00, cache read $0.30.
  • Anthropic Claude Opus 4.7: input $15.00, output $75.00, cache read $1.50.
  • Anthropic Claude Haiku 4.5: input $1.00, output $5.00, cache read $0.10.
  • OpenAI GPT-4o: input $2.50, output $10.00, cache read $1.25.
  • OpenAI GPT-4o mini: input $0.15, output $0.60, cache read $0.075.
  • Google Gemini 2.5 Pro: input $1.25, output $10.00, cache read $0.31.
  • Google Gemini 2.5 Flash: input $0.30, output $2.50, cache read $0.075.

Assumptions

  • Cache hit applies only to the system-prompt and few-shot prefix; the transcript itself is treated as cold (uncached) input.
  • Tokens are model-tokenizer agnostic — we report a single transcript-token figure. In practice tokenizers differ by 5–15% across vendors.
  • Output tokens are typical summary length; longer multi-section outputs scale linearly.

Limitations

  • Pricing changes. The snapshot date is the only thing on this page that's authoritative — verify with vendor pricing pages before committing.
  • Batch APIs (Anthropic, OpenAI) discount further, typically 50%. Add a second tool run with halved costs if you can defer to overnight.
  • Reasoning/thinking-tokens (o-series, extended thinking) are not modelled — they are an additional output billing line.

External resources

Planning estimates only — not financial, tax, or investment advice.