Methodology · Tool · Last updated 2026-04-23
How the Financial Document Token Estimator works
How the Financial Document Token Estimator prices 10-K, 10-Q, 8-K, and earnings-call extractions across eight frontier LLMs.
What the tool computes
Given either a pasted filing body or a representative archetype, the tool estimates the input-token count for each of eight frontier models, the output-token cost for a planned extraction, and the total dollar cost both for a single one-pass run and for a synthesis against N peer filings. It also flags whether the resulting context fits inside each model’s published window.
Everything runs client-side in your browser. Nothing is uploaded. There is no API call, no server, no telemetry on your pasted text.
Char-per-token ratios
Tokenization is probabilistic and model-family-specific. The tool uses rough published ratios per provider family to convert character length into a token estimate:
- Anthropic (Claude): ~3.5 characters per token on English prose.
- OpenAI (GPT family): ~4.0 characters per token on English prose (tiktoken baseline).
- Google (Gemini): ~4.0 characters per token as a practical approximation.
These are averages. A 10-K heavy in tabular numeric data will typically tokenize at a lower char/token ratio (i.e., more tokens per character) because numbers and symbols fragment more aggressively. The calculator is deliberately a planning tool, not a precise counter.
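The char-to-token conversion described above can be sketched as follows. The ratios mirror the per-family averages listed here; the function name and rounding choice are illustrative, not the tool's actual implementation, and audit-grade counts still require each vendor's own tokenizer.

```python
import math

# Rough chars-per-token averages per provider family (English prose).
CHARS_PER_TOKEN = {
    "anthropic": 3.5,  # Claude family
    "openai": 4.0,     # GPT family (tiktoken baseline)
    "google": 4.0,     # Gemini, practical approximation
}

def estimate_tokens(char_count: int, provider: str) -> int:
    """Round up: a partial token still bills as a whole token."""
    return math.ceil(char_count / CHARS_PER_TOKEN[provider])
```

For example, a 72,000-character 10-K body estimates to 18,000 tokens on the GPT family but roughly 20,572 on the Claude family, purely because of the ratio difference.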
Archetype assumptions
Archetype token counts are mid-range estimates sampled from public EDGAR filings and investor-relations transcripts. Actual lengths vary widely with issuer size and complexity; large-cap 10-Ks can easily exceed 40K tokens in the narrative sections alone.
| Archetype | ~tokens | ~chars | Reasoning |
|---|---|---|---|
| 10-K annual report (body) | 18,000 | 72,000 | Business + MD&A + risk-factors prose, excluding exhibits. |
| 10-Q quarterly report (body) | 9,000 | 36,000 | MD&A + condensed statements body. |
| 8-K current report | 2,500 | 10,000 | Single-event disclosure; typically short. |
| Earnings call transcript | 35,000 | 140,000 | Prepared remarks + Q&A for a large-cap quarterly call. |
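The archetype presets above amount to a small lookup table. A minimal sketch, assuming the ~4 chars/token prose baseline used to derive the tabulated character counts (the dict keys and helper name are illustrative):

```python
# Mid-range archetype token estimates from the table above.
ARCHETYPES = {
    "10-K body": 18_000,
    "10-Q body": 9_000,
    "8-K": 2_500,
    "Earnings call transcript": 35_000,
}

def archetype_chars(name: str, chars_per_token: float = 4.0) -> int:
    """Derive the approximate character count from the token estimate."""
    return int(ARCHETYPES[name] * chars_per_token)
```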
Cost formula
input_tokens = chars / chars_per_token (per provider)
cached_tokens = input_tokens × cache_hit_rate
fresh_tokens = input_tokens − cached_tokens
input_cost = fresh_tokens / 1e6 × input_rate
+ cached_tokens / 1e6 × input_rate × cache_read_multiplier
output_cost = output_tokens / 1e6 × output_rate
one_pass_cost = input_cost + output_cost
synthesis_cost = cost_for(input_tokens × (1 + peers), output_tokens)
fits_in_context = (synthesis_input_tokens + output_tokens) ≤ context_window
Pricing rate table (2026-04-23, USD per 1M tokens)
| Model | Input | Output | Cache read mult. | Context |
|---|---|---|---|---|
| Claude Haiku 4.5 | $1 | $5 | 0.10× | 200K |
| Claude Sonnet 4.6 | $3 | $15 | 0.10× | 500K |
| Claude Opus 4.6 | $15 | $75 | 0.10× | 500K |
| Claude Opus 4.7 | $15 | $75 | 0.10× | 1M |
| GPT-5 | $10 | $40 | 0.50× | 400K |
| GPT-5 mini | $2 | $8 | 0.50× | 256K |
| Gemini 2.5 Flash | $0.30 | $2.50 | 0.25× | 1M |
| Gemini 2.5 Pro | $1.25 | $10 | 0.25× | 2M |
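Putting the cost formula and rate table together, a one-pass run and an N-peer synthesis share the same computation: scale the input by `1 + peers`, split it into cached and fresh portions, and price each at its multiplier. A sketch under those assumptions (the function name and parameter defaults are illustrative; the example uses the Claude Sonnet 4.6 row):

```python
def extraction_cost(input_tokens: int, output_tokens: int,
                    input_rate: float, output_rate: float,
                    cache_read_multiplier: float,
                    cache_hit_rate: float = 0.0,
                    peers: int = 0,
                    context_window: int = 500_000):
    """Dollar cost per the formula above; peers=0 gives the one-pass cost."""
    total_input = input_tokens * (1 + peers)
    cached = total_input * cache_hit_rate
    fresh = total_input - cached
    input_cost = (fresh / 1e6 * input_rate
                  + cached / 1e6 * input_rate * cache_read_multiplier)
    output_cost = output_tokens / 1e6 * output_rate
    fits = (total_input + output_tokens) <= context_window
    return input_cost + output_cost, fits

# Claude Sonnet 4.6 row: $3 in, $15 out, 0.10x cache read, 500K context.
# One-pass extraction of an 18K-token 10-K with a 2K-token output:
cost, fits = extraction_cost(18_000, 2_000, 3.0, 15.0, 0.10)
# cost = 0.054 + 0.030 = $0.084; fits = True
```

Note how quickly the synthesis term grows: the same call with `peers=9` multiplies the input cost tenfold while the output cost stays flat, which is why cache-hit rate dominates long-context economics.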
Pricing sources
- Anthropic API pricing
- OpenAI API pricing
- Google AI / Gemini pricing
- Anthropic docs — count_tokens endpoint for precise counts
Limitations
- Tokenization is probabilistic. Use tiktoken (OpenAI) or Anthropic’s count_tokens endpoint for audit-grade numbers.
- Archetypes are representative mid-range samples; real filings from the same category span an order of magnitude.
- Cache-read multipliers for OpenAI and Gemini are approximate — verify against the current published tier before material decisions.
- Exhibits, tables, and image-based PDFs are not modelled. OCR’d tables in particular tokenize worse than narrative prose.
- No batch-API discounts, no enterprise tier pricing, no multi-modal surcharges.
- This is a planning tool, not investment advice, and not a substitute for the vendor’s official tokenizer or billing records.
Related articles
- Prompt-caching economics for finance LLM pipelines — why cache-hit rate dominates long-context cost.
- Token-cost reality for LLM trading research — how input size and retries compound into real monthly bills.
- Reading financial filings with LLMs in 2026 — extraction patterns, context-window trade-offs, and where caching pays back.
Changelog
- 2026-04-23 — Initial release with 8 models and 4 archetypes.