The short answer
For low-cost finance research-agent loops in 2026, DeepSeek is the cheaper hosted API on verified numbers: DeepSeek V4-Flash is $0.14 / Mtok cache-miss input and $0.28 output, with a 1M-token context window and default automatic caching. Mistral's hosted API spans Large 3, Medium 3.5, Small 4, and the Ministral 3 series, but its exact per-token API rates are not cleanly published on the official pricing page (it renders client-side and shows consumer plans), so treat Mistral's per-token figures as the one input you must confirm in your own console. On a finance agent loop, the cost lever is tokens-per-step and steps-per-loop more than the vendor. Model the envelope with the Agent Cost Envelope Calculator.
TL;DR
| Vendor | Model | Input $/Mtok | Output $/Mtok | Context | Verification |
|---|---|---|---|---|---|
| DeepSeek | V4-Flash | $0.14 (miss) / $0.0028 (hit) | $0.28 | 1M | Official docs |
| DeepSeek | V4-Pro | $0.435 (hit-adj.) | $0.87 | 1M | Official docs (promo) |
| Mistral | Large 3 / Medium 3.5 / Small 4 / Ministral 3 | see note | see note | per model | Lineup verified; per-token rates not machine-readable on official page |
DeepSeek's numbers are verified end to end. Mistral's model lineup is verified; its per-token API rates are the one fact to confirm directly before you budget.
DeepSeek: verified, and cheap
DeepSeek's current hosted models are deepseek-v4-flash and deepseek-v4-pro, both with a 1M-token context window and 384k max output. The legacy aliases deepseek-chat and deepseek-reasoner now route to the non-thinking and thinking modes of V4-Flash respectively, and are scheduled for deprecation on 2026-07-24. [1]
Verified per-Mtok pricing [1]:
- V4-Flash: cache-miss input $0.14, cache-hit input $0.0028, output $0.28.
- V4-Pro: cache-miss input is listed under a 75%-off promotion through 2026-05-31 (after which it adjusts to one quarter of the original price); output $0.87.
The cache-hit input rate is the standout: at $0.0028 / Mtok it is ~2% of the cache-miss rate, and automatic context caching is on by default. For a research agent that re-sends a large fixed instruction block every step, that boilerplate is nearly free after the first step.
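To put a number on that, here is a minimal sketch using the V4-Flash rates quoted above. The loop shape (6 steps, 4,000 input and 800 output tokens per step), the 3,000-token fixed prefix, and the assumption that every step after the first gets a full cache hit on that prefix are illustrative, not measured; real hit rates depend on how the provider matches prefixes.

```python
# V4-Flash list prices from the section above, in USD per million tokens.
MISS_IN, HIT_IN, OUT = 0.14, 0.0028, 0.28


def loop_cost(steps, input_tokens, output_tokens, cached_prefix_tokens=0):
    """Cost in USD of one agent loop.

    Assumption: step 1 pays the cache-miss rate on all input; every later
    step pays the cache-hit rate on `cached_prefix_tokens` and the miss
    rate on the remainder. This is a best-case model of prefix caching.
    """
    total = 0.0
    for step in range(steps):
        hit = cached_prefix_tokens if step > 0 else 0
        miss = input_tokens - hit
        total += (miss * MISS_IN + hit * HIT_IN + output_tokens * OUT) / 1_000_000
    return total


# Hypothetical loop shape: 6 steps, 4,000 input tokens per step
# (3,000 of them a fixed instruction prefix), 800 output tokens per step.
uncached = loop_cost(6, 4000, 800)
cached = loop_cost(6, 4000, 800, cached_prefix_tokens=3000)

print(f"no cache:   ${uncached:.6f} per loop")
print(f"with cache: ${cached:.6f} per loop")
print(f"saving:     {1 - cached / uncached:.0%}")
```

Under these assumptions the cached prefix contributes almost nothing to the later steps, and output tokens become the largest remaining line item.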
Mistral: verified lineup, unverified per-token rates
Mistral's current hosted lineup, verified from the official model documentation, is Mistral Large 3 (open-weight general multimodal), Magistral Medium 1.2 (premier reasoning), Mistral Medium 3.5 (agentic/coding), Mistral Small 4, and the Ministral 3 series (14B / 8B / 3B). [2]
What is not cleanly verifiable from Mistral's official pricing page (as of May 2026) is the exact per-Mtok API rate per model: the pricing page renders client-side and surfaces consumer Le Chat plans rather than a machine-readable API rate card. Third-party trackers quote figures (and disagree among themselves), which is precisely why this article does not state a hard Mistral per-token number as verified. Confirm the live rate in the Mistral console or API pricing section before you budget against it. [3] The verified facts: the lineup above, and per-token billing with cache-read discounts on most models.
Why the loop shape beats the vendor
For a finance research agent, the dominant cost driver is not the per-token rate but the loop:
- Tokens per step: how much context each tool-call + reasoning step carries.
- Steps per loop: how many iterations before convergence.
- Markets per day: how many independent loops you run.
A 2x reduction in steps-per-loop or a disciplined context budget per step moves the bill more than swapping a $0.14 model for a $0.28 one. The cheapest vendor on a bloated loop loses to a disciplined loop on a mid-priced model.
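A small sketch of that last comparison, under a deliberately simple linear model (cost = tokens per step × rate × steps × loops, no caching). The "bloated" and "disciplined" loop shapes are hypothetical, and the mid-priced model is simply assumed to cost exactly double the V4-Flash list rates; none of these numbers come from a measured workload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Loop:
    input_tokens_per_step: int
    output_tokens_per_step: int
    steps_per_loop: int
    markets_per_day: int


def daily_cost(loop: Loop, in_rate: float, out_rate: float) -> float:
    """USD per day; rates are USD per million tokens, no caching assumed."""
    per_step = (loop.input_tokens_per_step * in_rate
                + loop.output_tokens_per_step * out_rate) / 1_000_000
    return per_step * loop.steps_per_loop * loop.markets_per_day


# Hypothetical loop shapes, chosen only for illustration.
bloated = Loop(input_tokens_per_step=12_000, output_tokens_per_step=800,
               steps_per_loop=10, markets_per_day=150)
disciplined = Loop(input_tokens_per_step=4_000, output_tokens_per_step=800,
                   steps_per_loop=6, markets_per_day=150)

# V4-Flash list rates on the bloated loop.
cheap = daily_cost(bloated, in_rate=0.14, out_rate=0.28)
# Hypothetical model at double those rates on the disciplined loop.
mid = daily_cost(disciplined, in_rate=0.28, out_rate=0.56)

print(f"bloated loop, cheapest rates:    ${cheap:.4f}/day")
print(f"disciplined loop, doubled rates: ${mid:.4f}/day")
```

With these illustrative shapes the disciplined loop costs about half as much per day despite paying twice the per-token rate.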
Verified engine output
The block below runs the Agent Cost Envelope Calculator on a 150-market-per-day research loop (6 steps/loop, a convergence check, business-calendar cadence) using the engine's cheapest in-table frontier-class model as the reference floor. DeepSeek V4-Flash sits below that floor on a per-token basis, so the engine number is a conservative upper bound on what the same loop costs on DeepSeek: the real DeepSeek bill is lower. The output is computed live from the shipped bundle, not typed by hand; the DeepSeek and Mistral list prices in the table above are the verified inputs to do the substitution yourself.
Decision guidance
- Cheapest verified hosted API for a finance agent loop: DeepSeek V4-Flash, especially with the default cache amortizing a fixed instruction prefix.
- Need a thinking/reasoning mode: DeepSeek V4-Flash thinking mode (via the reasoner alias until deprecation) or V4-Pro; on Mistral, Magistral Medium.
- EU data-residency or open-weight self-host requirement: Mistral's open-weight models are the relevant option; confirm the hosted per-token rate before assuming a price.
- Budget against either: substitute the verified per-token rate into your own loop shape; the loop discipline matters more than the headline rate.
Connects to
- Agent Cost Envelope Calculator: per-loop economics for a research agent.
- Cheapest LLM for SEC Filings 2026: DeepSeek V4-Flash in the filing-extraction context.
- Claude vs GPT-5 vs Gemini for Financial Analysis 2026: when the loop needs a frontier reasoning tier.
- Best LLM for Financial Analysis 2026: the task-tiered pillar.
References
1. DeepSeek. "Models & Pricing." api-docs.deepseek.com, verified 2026-05-25. https://api-docs.deepseek.com/quick_start/pricing
2. Mistral AI. "Models Overview." docs.mistral.ai, verified 2026-05-25. https://docs.mistral.ai/getting-started/models/models_overview/
3. Mistral AI. "Pricing." mistral.ai, verified 2026-05-25 (page renders client-side; per-token API rates not machine-readable). https://mistral.ai/pricing
Verified engine output
| Input | Value |
|---|---|
| model_id | claude-haiku-4-5 |
| input_tokens_per_step | 4000 |
| output_tokens_per_step | 800 |
| steps_per_loop | 6 |
| convergence_check_pct | 50 |
| markets_per_day | 150 |
| target_monthly_usd | 500 |
| calendar_mode | business |

| Output | Value |
|---|---|
| model › id | claude-haiku-4-5 |
| model › provider | anthropic |
| model › name | Claude Haiku 4.5 |
| model › tier | economy |
| model › input usd per mtoken | 1 |
| model › output usd per mtoken | 5 |
| model › cache read usd per mtoken | 0.1 |
| model › context window | 200000 |
| model › notes | Cheap filtering / pre-processing. |
| steps (7 items) | [...] |
| cost per loop (USD) | 0.052 |
| tool use subtotal (USD) | 0.048 |
| convergence cost (USD) | 0.004 |
| cost per day (USD) | 7.80 |
| cost per month (USD) | 171.60 |
| days per month | 22 |
| tokens per loop | 31200 |
| blended usd per 1k tokens | 0.0016667 |
| within budget | true |
| budget utilization | 0.3432 |
Computed live at build time.
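The figures above can be reproduced by hand. The sketch below is not the calculator's source; it is one reading of the inputs that happens to match the published outputs, assuming the convergence check is billed as `convergence_check_pct` percent of one ordinary step. The same function then substitutes the DeepSeek V4-Flash list rates from the TL;DR table (cache-miss input, no caching credit). The function name `envelope` and its parameter names are hypothetical.

```python
def envelope(in_rate, out_rate, input_tokens=4000, output_tokens=800,
             steps=6, convergence_pct=50, markets_per_day=150, days=22):
    """Return (cost_per_loop, cost_per_day, cost_per_month) in USD.

    Rates are USD per million tokens. Assumption: the convergence check
    costs `convergence_pct` percent of one full step.
    """
    step_cost = (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
    per_loop = step_cost * (steps + convergence_pct / 100)
    per_day = per_loop * markets_per_day
    return per_loop, per_day, per_day * days


# Reference model rates as listed in the engine output above.
haiku = envelope(in_rate=1.0, out_rate=5.0)
# DeepSeek V4-Flash cache-miss list rates, same loop shape.
flash = envelope(in_rate=0.14, out_rate=0.28)

for name, (loop, day, month) in {"claude-haiku-4-5": haiku,
                                 "deepseek-v4-flash": flash}.items():
    print(f"{name:18s} loop ${loop:.6f}  day ${day:.4f}  month ${month:.2f}")
```

On these assumptions the same loop comes to roughly $17 a month on V4-Flash before any cache discount, which is consistent with the engine figure being an upper bound.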
Frequently asked questions
- Is DeepSeek or Mistral cheaper for financial analysis in 2026?
- DeepSeek is the cheaper verified hosted API — V4-Flash at $0.14/$0.28 per Mtok with a 1M context and near-free cache hits. Mistral's exact per-token API rates are not cleanly published on its official pricing page, so a precise comparison requires confirming Mistral's live rate.
- What is DeepSeek's context window?
- Both V4-Flash and V4-Pro carry a 1M-token context window with 384k max output.
- What happened to deepseek-chat and deepseek-reasoner?
- They are now legacy aliases routing to the non-thinking and thinking modes of V4-Flash, scheduled for deprecation on 2026-07-24.
- What are Mistral's current models?
- Mistral Large 3, Magistral Medium 1.2, Mistral Medium 3.5, Mistral Small 4, and the Ministral 3 series (14B/8B/3B), per the official model documentation.
- Does the vendor or the loop shape matter more for cost?
- The loop shape — tokens per step and steps per loop — usually moves the bill more than the per-token rate. A disciplined loop on a mid-priced model beats a bloated loop on the cheapest model.