Cost Per 1K Tokens Formula
The cost of an LLM call is the input token count times the input price plus the output token count times the output price, with prices quoted per thousand (or per million) tokens. Because output tokens are typically priced several times higher than input tokens, removing one output token saves several times more than removing one input token, so controlling generation length is often a stronger lever on the bill than trimming the prompt.
Formula
Cost = (InputTokens / 1000) × P_in + (OutputTokens / 1000) × P_out
where P_in and P_out are prices per 1K tokens.
Variables
InputTokens
Input (prompt) tokens
Tokens sent to the model: system prompt, context, documents, and the user message. For finance workloads (filings, transcripts) this is usually the larger of the two counts, but it is billed at the cheaper per-token rate.
OutputTokens
Output (completion) tokens
Tokens the model generates. Output is priced higher than input on most providers, often 3 to 5 times, so a verbose response can dominate the cost even when the prompt is large.
P_in
Input price per 1K tokens
Provider's published input rate. Quoted per million tokens by most vendors; divide by 1000 to get the per-1K rate used here.
P_out
Output price per 1K tokens
Provider's published output rate, applied to generated tokens. The input/output price gap is the single most important lever in cost engineering.
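The formula and its variables can be sketched in a few lines of Python. This is a minimal illustration, not a provider SDK: the function names `call_cost` and `per_1k_from_per_million` are hypothetical, and the prices passed in are assumed to be the per-1K rates defined above.

```python
def per_1k_from_per_million(price_per_million: float) -> float:
    """Convert a per-million-token price to the per-1K rate used in the formula."""
    return price_per_million / 1000


def call_cost(input_tokens: int, output_tokens: int,
              p_in: float, p_out: float) -> float:
    """Cost = (InputTokens / 1000) * P_in + (OutputTokens / 1000) * P_out.

    p_in and p_out are prices per 1K tokens, in whatever currency
    the provider bills in.
    """
    return (input_tokens / 1000) * p_in + (output_tokens / 1000) * p_out


# Illustrative check: 1,000 tokens each way costs exactly P_in + P_out.
p_in = per_1k_from_per_million(5.00)    # 0.005 per 1K
p_out = per_1k_from_per_million(25.00)  # 0.025 per 1K
print(f"{call_cost(1000, 1000, p_in, p_out):.3f}")  # prints 0.030
```

Keeping the unit conversion in its own helper makes it harder to mix per-million and per-1K prices in the same calculation, which is an easy source of thousandfold errors.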
Step By Step
1. Count or estimate input and output tokens for a representative call.
   A filing-summary call uses 8,000 input tokens and produces 1,200 output tokens.
2. Convert provider per-million prices to per-1K by dividing by 1000.
   Input 5.00 per million is 0.005 per 1K; output 25.00 per million is 0.025 per 1K.
3. Multiply each token count (in thousands) by its price and add.
   (8000/1000) × 0.005 + (1200/1000) × 0.025 = 8 × 0.005 + 1.2 × 0.025.
4. Sum for the per-call cost, then multiply by call volume for the workload cost.
   0.040 + 0.030 = 0.070 per call; 10,000 calls per day is 700 per day.
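The four steps above map directly onto a short script. The sketch below uses the figures from the steps; the variable names are illustrative, and the token counts are assumed to come from a tokenizer or a provider usage report rather than being computed here.

```python
# Step 1: token counts for a representative call (taken from the example above).
input_tokens = 8_000
output_tokens = 1_200

# Step 2: convert per-million prices to per-1K prices.
p_in = 5.00 / 1000    # 0.005 per 1K input tokens
p_out = 25.00 / 1000  # 0.025 per 1K output tokens

# Step 3: multiply each token count (in thousands) by its price.
input_cost = (input_tokens / 1000) * p_in     # 8 * 0.005
output_cost = (output_tokens / 1000) * p_out  # 1.2 * 0.025

# Step 4: sum for the per-call cost, then scale by daily call volume.
per_call = input_cost + output_cost
calls_per_day = 10_000
per_day = per_call * calls_per_day

print(f"per call: {per_call:.3f}")  # prints per call: 0.070
print(f"per day:  {per_day:.2f}")   # prints per day:  700.00
```

The results are formatted to a fixed number of decimals because binary floating point does not represent values like 0.025 exactly. For billing-grade arithmetic, `decimal.Decimal` would be a safer choice.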
Worked Example
Cost of one SEC-filing summarization call at flagship pricing
Input tokens: 8,000
Output tokens: 1,200
Input / output price (per 1M): 5.00 / 25.00
Per-1K prices: P_in = 5.00/1000 = 0.005, P_out = 25.00/1000 = 0.025.
Input cost = (8000/1000) × 0.005 = 8 × 0.005 = 0.040.
Output cost = (1200/1000) × 0.025 = 1.2 × 0.025 = 0.030.
Total = 0.040 + 0.030 = 0.070.
About 0.07 per call. Note that the 1,200 output tokens cost three-quarters as much (0.030) as the 8,000 input tokens (0.040), because output is priced 5x higher. Cutting the summary to 600 tokens would save 0.015 per call, three times the 0.005 saved by trimming a thousand input tokens. At 10,000 calls a day, the shorter summary is the difference between 700 and 550 daily.
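The comparison between shortening the output and trimming the prompt can be checked with a small what-if calculation. This is a sketch under the worked example's prices; `call_cost` is a hypothetical helper, not part of any provider library.

```python
P_IN = 0.005   # per 1K input tokens (5.00 per million)
P_OUT = 0.025  # per 1K output tokens (25.00 per million)
CALLS_PER_DAY = 10_000


def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Per-call cost at the worked example's per-1K prices."""
    return (input_tokens / 1000) * P_IN + (output_tokens / 1000) * P_OUT


baseline = call_cost(8_000, 1_200)
shorter_summary = call_cost(8_000, 600)   # halve the output
shorter_prompt = call_cost(7_000, 1_200)  # trim 1,000 input tokens

saving_output = baseline - shorter_summary
saving_input = baseline - shorter_prompt
daily_baseline = baseline * CALLS_PER_DAY
daily_short = shorter_summary * CALLS_PER_DAY

print(f"saved by cutting 600 output tokensL {saving_output:.3f}".replace("L", ":"))
print(f"saved by cutting 1,000 input tokens: {saving_input:.3f}")
print(f"daily cost: {daily_baseline:.2f} vs {daily_short:.2f}")
```

With these prices, one output token costs the same as P_OUT / P_IN = 5 input tokens, so 600 fewer output tokens save as much as 3,000 fewer input tokens would.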