aifinhub


Earnings-Call Summarization Cost Calculator

Compute LLM cost per stock per quarter to summarize earnings transcripts across Sonnet, Opus, GPT-4o, Gemini 2.5 Pro/Flash. Cache-hit-rate aware.

Inputs: Form inputs / CSV
Runtime: Instant
Privacy: Client-side · no upload
API key: Not required
Methodology: Open →

Education · Not investment advice. BaFin/EU framework. Past performance does not indicate future results.

Workload

Tickers / quarter: 50
Transcript tokens (input): 15,000
Summary tokens (output): 800
Cache hit rate: 50%
Attempts / transcript: 1

Primary model (hero)

Pricing snapshot as of 2026-04-25. List prices, no batch discount.

Cost per stock per quarter

$0.04

Claude Sonnet 4.5 · $0.15/stock/year · $7.35 for full 50-ticker universe per year.
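The hero figure follows from a simple cache-hit-aware formula. A minimal sketch, assuming Sonnet 4.5 list prices of $3/M input, $15/M output, and $0.30/M for cache reads (the cache-read rate is an assumption; check the provider's pricing page):

```python
def cost_per_stock_qtr(in_tok, out_tok, cache_hit,
                       in_price_m, out_price_m, cache_price_m, attempts=1):
    """USD to summarize one transcript; prices are per 1M tokens."""
    cold   = in_tok * (1 - cache_hit) * in_price_m   # uncached input at full price
    cached = in_tok * cache_hit * cache_price_m      # cache reads at the discounted rate
    output = out_tok * out_price_m
    return attempts * (cold + cached + output) / 1e6

# Workload defaults above: 15,000 in, 800 out, 50% cache hit rate
c = cost_per_stock_qtr(15_000, 800, 0.50, 3.00, 15.00, 0.30)
print(c)          # ≈ $0.037/stock/qtr; ×4 quarters ×50 tickers ≈ $7.35/universe/yr
```

Multiplying by 4 quarters and 50 tickers reproduces the $7.35 universe figure in the table below.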

Ranked by annual cost (full universe)

Model                              $/stock/qtr   $/stock/yr   Universe/yr
OpenAI · GPT-4o mini                    $0.002        $0.01        $0.43
Google · Gemini 2.5 Flash               $0.005        $0.02        $0.96
Anthropic · Claude Haiku 4.5            $0.012        $0.05        $2.45
Google · Gemini 2.5 Pro                 $0.020        $0.08        $3.94
OpenAI · GPT-4o                         $0.036        $0.14        $7.22
Anthropic · Claude Sonnet 4.5           $0.037        $0.15        $7.35
Anthropic · Claude Opus 4.7             $0.184        $0.73       $36.75

Caveats

Estimates assume average token counts; actual transcript lengths can vary by 5×. Cache reads only apply when prompts share a stable prefix (system prompt + few-shot examples). The earnings transcript itself is always cold input. See methodology.
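The prefix caveat can be made concrete: only the shared prompt prefix participates in caching, while the transcript is billed at full input price every time. A sketch, with a hypothetical 2,000-token schema prefix and assumed Sonnet-style prices ($3/M input, $15/M output, $0.30/M cache reads):

```python
def prefix_aware_cost(prefix_tok, transcript_tok, out_tok, hit_rate,
                      in_price_m, out_price_m, cache_price_m):
    """USD per call; only the shared prompt prefix can be served from cache."""
    prefix_cost = prefix_tok * (hit_rate * cache_price_m + (1 - hit_rate) * in_price_m)
    cold_cost   = transcript_tok * in_price_m        # transcript is always cold input
    return (prefix_cost + cold_cost + out_tok * out_price_m) / 1e6

# 90% of calls hit the cached prefix; the 15,000-token transcript never does
print(prefix_aware_cost(2_000, 15_000, 800, 0.90, 3.00, 15.00, 0.30))
```

Note the result is higher than applying the hit rate to all input tokens, which is why a headline "cache hit rate" slider can overstate savings.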

How to use

Step-by-step

Full calculator guide →
  1. Enter average call duration in minutes (typical: 60), or upload sample transcripts to measure actual token counts.

  2. Pick a model: Sonnet for budget efficiency, Opus for highest extraction accuracy, GPT-4 for tool-use chains.

  3. Set summary output length (typical: 500-1,500 tokens for a structured summary).

  4. Read per-call cost and per-quarter cost (multiplied by your call volume). Compare across models.

  5. Toggle prompt caching if you have a stable extraction schema. The 90% discount on cached tokens often shifts the cost ranking between models.
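The ranking shift in step 5 can be sketched with two hypothetical models: one cheap on input with no cache discount, one pricier but with a 90% cache-read discount (all prices here are illustrative, not any provider's actual rates):

```python
def per_call(in_tok, out_tok, hit, in_m, out_m, cache_m):
    """USD per call; prices per 1M tokens."""
    return (in_tok * ((1 - hit) * in_m + hit * cache_m) + out_tok * out_m) / 1e6

A = dict(in_m=2.0, out_m=8.0,  cache_m=2.0)   # cheap input, no cache discount
B = dict(in_m=3.0, out_m=15.0, cache_m=0.30)  # 90% discount on cache reads

for hit in (0.0, 0.9):
    ca = per_call(15_000, 800, hit, **A)
    cb = per_call(15_000, 800, hit, **B)
    print(hit, round(ca, 4), round(cb, 4), "A cheaper" if ca < cb else "B cheaper")
```

With caching off, A wins; at a 90% hit rate, B overtakes it, which is exactly the ranking shift the toggle surfaces.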


Questions people ask next

FAQ

What's the per-call cost range?

Depends on transcript length and model. A 60-minute earnings call is roughly 9,000-12,000 tokens of transcript. At Claude Sonnet pricing (input $3/M, output $15/M), summarization with 1,000-token output costs ~$0.04. At Opus, ~$0.20. At GPT-4, ~$0.30. For 500 companies/quarter, full-Opus runs ~$100, Sonnet ~$20.
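The arithmetic behind that range, at the Sonnet list prices quoted above ($3/M input, $15/M output) and a 1,000-token summary:

```python
in_price, out_price = 3.00, 15.00   # Sonnet-class list prices per 1M tokens
for transcript in (9_000, 12_000):  # short vs long 60-minute call
    usd = (transcript * in_price + 1_000 * out_price) / 1e6
    print(transcript, round(usd, 3))  # ~$0.042 and ~$0.051 per call, uncached
```

Prompt caching on a shared prefix pulls the low end toward the ~$0.04 quoted above.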

Why does long-context model selection matter?

Some calls run long (Q&A pushes 80+ minutes) — 15,000+ token transcripts. Models with 200K context (Claude, GPT-4) handle these in one pass. Models with 32K or 16K context need chunking, which increases cost (each chunk has its own prompt overhead) and degrades summary quality. The tool surfaces this tradeoff.
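The chunking overhead is easy to quantify: each extra chunk re-sends the full prompt. A simplified sketch (it ignores output-token headroom and chunk-overlap tokens; the 16K context and 2,000-token prompt are illustrative):

```python
import math

def single_vs_chunked(transcript_tok, prompt_tok, context_limit):
    """Total input tokens billed: one pass vs chunking into a small context."""
    payload = context_limit - prompt_tok            # transcript tokens per chunk
    chunks = math.ceil(transcript_tok / payload)
    single  = transcript_tok + prompt_tok           # one pass, one prompt
    chunked = transcript_tok + chunks * prompt_tok  # prompt repeated per chunk
    return single, chunked

# 15,000-token transcript, 2,000-token prompt, 16K-context model
print(single_vs_chunked(15_000, 2_000, 16_000))
```

Beyond the extra tokens, the chunked summaries still need a merge pass, which adds another call on top.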

How does prompt caching help here?

If your summarization prompt is the same across calls (system prompt + extraction schema), caching the prompt saves 90% on the input cost of those tokens. For 500 calls/quarter with a 2,000-token system prompt, that's 1M cached tokens per quarter — roughly $2.70 saved at Sonnet input pricing ($3/M), ~$13.50 at Opus ($15/M). The tool models this directly.
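The savings calculation, assuming input list prices of $3/M (Sonnet) and $15/M (Opus) and a flat 90% cache-read discount:

```python
calls, prompt_tok = 500, 2_000
for name, in_price in (("Sonnet", 3.00), ("Opus", 15.00)):  # assumed $/M input
    saved = calls * prompt_tok * in_price * 0.90 / 1e6      # 90% off cached tokens
    print(name, round(saved, 2))
```

Cache writes typically carry a premium over base input price, so real savings are slightly lower than this upper bound.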

Can I summarize calls in real-time?

Latency-wise, yes — 12K input → 1K output runs in 5-15 seconds across the major providers. The bottleneck isn't compute, it's transcript availability: most calls have transcripts available within 1-3 hours. For real-time research, audio-to-text + immediate summarization is possible but adds transcription cost (Whisper-style pricing runs about $0.006/minute, so roughly $0.36 for a 60-minute call).

What if I need to extract structured data, not a summary?

Structured extraction (e.g., guidance changes, KPI mentions) costs more than summarization because it requires longer outputs and stricter prompting. The tool has a 'structured extraction' mode that adjusts the output token estimate accordingly. For high-volume extraction at scale, fine-tuning a smaller model often beats the big-model API.
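One way the 'structured extraction' mode could adjust the estimate is to inflate the output-token count; the 2.5× multiplier below is a hypothetical placeholder, not the tool's actual factor:

```python
def extraction_cost(in_tok, base_out_tok, in_price_m, out_price_m,
                    structured=False, out_multiplier=2.5):
    """USD per call; structured mode inflates the output-token estimate."""
    out_tok = base_out_tok * (out_multiplier if structured else 1)
    return (in_tok * in_price_m + out_tok * out_price_m) / 1e6

# Summary vs structured extraction at Sonnet-style prices ($3/M in, $15/M out)
print(extraction_cost(15_000, 800, 3.0, 15.0))                  # summary
print(extraction_cost(15_000, 800, 3.0, 15.0, structured=True)) # extraction
```

Because output tokens are ~5× the input price here, output length dominates the delta — which is why extraction costs meaningfully more than summarization.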


Planning estimates only — not financial, tax, or investment advice.