The short answer

DeepSeek V4 is the cost floor for SEC filing work in 2026. V4-Flash lists at $0.14/$0.28 per Mtok with a 1M-token window and 384K max output, about $0.018 per 120k-token filing and roughly $179 across 10,000, falling toward $15 when the repeated schema prefix hits the cache. Pin the explicit V4 IDs before the legacy names retire 2026/07/24.

DeepSeek V4 is the cost floor for SEC filing work in 2026. V4-Flash lists at $0.14 per million input tokens and $0.28 per million output, with a 1M-token context window and 384K max output that holds a full 10-K without chunking. At a 120k-in / 4k-out filing shape that is about $0.018 per filing, roughly $179 across 10,000 filings, dropping toward $15 when the repeated schema prefix hits the cache. The catch is migration: the legacy deepseek-chat and deepseek-reasoner names retire 2026/07/24 15:59 UTC, so pin the explicit V4 IDs now. Price your own filing volume in the Token Cost Optimizer.

TL;DR

Model Input $/Mtok Output $/Mtok Context Max output ~$/filing (120k in + 4k out)
DeepSeek V4-Flash $0.14 $0.28 1M 384K $0.018
DeepSeek V4-Flash (cache-hit prefix) $0.0028 $0.28 1M 384K $0.0015
DeepSeek V4-Pro $0.435 $0.87 1M 384K $0.056

Per-filing costs are arithmetic from the verified list prices (120,000 input tokens × input rate + 4,000 output tokens × output rate), not a benchmark run. Prices verified 2026-06-16 against DeepSeek's official pricing page.1

Why V4 resets the filings price floor

SEC extraction is a long-input, high-volume, structured-output job: feed a 10-K or 10-Q, pull specific line items, repeat across thousands of filings. The bill is dominated by input tokens on long documents and output tokens on structured results, so per-token price and context fit decide it. V4-Flash pairs a 1M-token window with $0.14 / $0.28 rates, so a full filing fits in one call and the per-document cost lands near two cents.1 Earlier coverage on this site quoted V4-Pro under a 75% promotion; that promotion is gone, and the steadier number to budget against is the V4-Flash list rate.

The verified V4 prices

DeepSeek's pricing page lists two V4 models, both at a 1M context window and 384K max output.1

V4-Flash $0.14 / Mtok cache-miss input, $0.0028 / Mtok cache-hit input, $0.28 / Mtok output. This is the production default: cheaper, faster, sufficient for most extraction.

V4-Pro $0.435 / Mtok cache-miss input, $0.003625 / Mtok cache-hit input, $0.87 / Mtok output. The reasoning-heavier variant for harder fields, still well under frontier-tier rates.

Context caching is automatic on every request: when a prompt shares a prefix, the API bills the cache-hit rate.1 On V4-Flash that prefix drops from $0.14 to $0.0028 per Mtok, about 2% of the cache-miss rate.

The per-filing math at scale

A 10-K body commonly lands in the 100k–150k-token range. Pricing a 120k-input, 4k-output extraction on V4-Flash list rates gives about $0.018 per filing.1 Across 10,000 filings that is roughly $179. Filings repeat heavy boilerplate, so if you pin a fixed extraction schema and instruction block ahead of the filing text, the cached prefix bills at $0.0028 / Mtok and the same sweep falls toward $15. V4-Pro at $0.435 / $0.87 costs about $0.056 per filing, near $557 across 10,000, the price of reserving the heavier model for hard numeric fields.

What V4 does not change: the accuracy floor

Cheap tokens do not mean correct extractions. The FinanceBench study showed how hard open-book financial QA is for language models: on a 150-case sample, GPT-4-Turbo with a retrieval system incorrectly answered or refused 81% of questions, and the full benchmark runs to 10,231 questions over real filings.2 Newer models do better, but the lesson holds: a budget extractor that misreads a parenthetical "(loss)" as positive is expensive in errors. Pick the cheapest model that clears your accuracy bar on an eval of your own filings, not the cheapest model outright.

The migration deadline you cannot ignore

DeepSeek shipped V4-Pro and V4-Flash on 2026-04-24 and kept the legacy aliases alive for a 90-day window.3 The names deepseek-chat and deepseek-reasoner retire 2026/07/24 15:59 UTC; during the grace period both route transparently to V4-Flash modes, so production code is already on V4 whether or not it has been updated.13 After the deadline, requests using the old names fail. Pin deepseek-v4-flash or deepseek-v4-pro explicitly, test thinking and non-thinking modes, and keep a fallback path before the cutoff.

Decision guidance

  • Absolute cheapest full-filing fit: V4-Flash (1M context, ~$0.018/filing). Confirm data-handling terms suit your use.
  • Repeated schema across thousands of filings: turn the boilerplate into a cached prefix; the sweep cost drops toward $15 per 10,000.
  • Harder numeric fields: route the difficult subset to V4-Pro and keep V4-Flash on the bulk.
  • High-stakes outputs: run an eval; a budget extractor feeding a frontier verifier may be the real cheapest-correct path.
  • Shipping before late July: migrate off deepseek-chat/deepseek-reasoner to the explicit V4 IDs now.

Connects to

References

Footnotes

  1. DeepSeek. "Models & Pricing." api-docs.deepseek.com, verified 2026-06-16. https://api-docs.deepseek.com/quick_start/pricing 2 3 4 5 6

  2. Islam, Pranab et al. "FinanceBench: A New Benchmark for Financial Question Answering." arXiv:2311.11944, accessed 2026-06-17. https://arxiv.org/abs/2311.11944

  3. DeepSeek. "Change Log." api-docs.deepseek.com/updates, verified 2026-06-16. https://api-docs.deepseek.com/updates 2

Frequently asked questions

How much does DeepSeek V4 cost to extract one SEC filing?
On V4-Flash list rates ($0.14 input / $0.28 output per Mtok), a 120k-input, 4k-output extraction is about $0.018 per filing, roughly $179 across 10,000 filings. Pinning a fixed schema prefix that hits the cache-hit rate ($0.0028 per Mtok) cuts the input cost to about 2% of cache-miss, pulling a 10,000-filing sweep toward $15. Verified against DeepSeek's pricing page 2026-06-17.
Does DeepSeek V4 fit a full 10-K in context?
Yes. Both V4-Flash and V4-Pro carry a 1M-token context window with 384K max output, so a full 10-K, whose body commonly lands in the 100k-150k-token range, fits in one call with room for instructions and few-shot examples. That removes the overlap and stitching bugs that come with forced chunking on smaller windows.
Is the cheap price the old promotional rate?
No. Earlier coverage quoted V4-Pro under a 75% promotion that has ended. The current verified list rates are V4-Flash $0.14/$0.28 and V4-Pro $0.435/$0.87 per Mtok, with no promotional discount shown on the pricing page as of 2026-06-17. Budget against the V4-Flash list rate as the steady cost floor.
What is the July 24 2026 migration deadline?
DeepSeek launched V4 on 2026-04-24 and kept the legacy aliases deepseek-chat and deepseek-reasoner working for a 90-day window. They retire 2026/07/24 15:59 UTC, after which requests using those names fail. Pin deepseek-v4-flash or deepseek-v4-pro explicitly, test thinking and non-thinking modes, and keep a fallback before the cutoff.
Is DeepSeek V4 accurate enough for filings?
Not automatically. FinanceBench found GPT-4-Turbo with retrieval incorrectly answered or refused 81% of a 150-case sample, showing how hard open-book financial QA is. Pick the cheapest model that clears your accuracy bar on an eval of your own filings, and consider a budget extractor feeding a frontier verifier for high-stakes numeric fields.