Cost Per 1K Tokens Formula

The cost of an LLM call is the input token count times the input price plus the output token count times the output price, with prices quoted per thousand (or per million) tokens. Because output tokens are typically priced several times higher than input tokens, each token cut from the generated response usually saves more than each token trimmed from the prompt.

Published May 26, 2026

By AI Fin Hub Research · AI Fin Hub Team

Formula

 Cost = (InputTokens / 1000) x P_in + (OutputTokens / 1000) x P_out

where P_in, P_out are prices per 1K tokens
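
As a minimal sketch, the expression above can be written as a small Python function. The function and parameter names are illustrative and not part of any provider SDK; both prices are assumed to be per 1K tokens, as in the formula, and the sample values are the ones used in the worked example further down.

```python
def call_cost(input_tokens: int, output_tokens: int,
              p_in_per_1k: float, p_out_per_1k: float) -> float:
    """Cost of one call, with both prices quoted per 1K tokens."""
    input_cost = (input_tokens / 1000) * p_in_per_1k
    output_cost = (output_tokens / 1000) * p_out_per_1k
    return input_cost + output_cost


# 8,000 input tokens, 1,200 output tokens, 0.005 / 0.025 per 1K tokens.
print(round(call_cost(8000, 1200, 0.005, 0.025), 4))  # 0.07
```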

Variables

InputTokens

Input (prompt) tokens

Tokens sent to the model: system prompt, context, documents, and the user message. For finance workloads (filings, transcripts) this is usually the larger count, but it is billed at the cheaper per-token rate.

OutputTokens

Output (completion) tokens

Tokens the model generates. Most providers price output higher than input, often 3 to 5 times higher, so a verbose response can dominate the cost even when the prompt is large.

P_in

Input price per 1K tokens

Provider's published input rate. Quoted per million tokens by most vendors; divide by 1000 to get the per-1K rate used here.

P_out

Output price per 1K tokens

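Provider's published output rate, applied to generated tokens. The gap between input and output prices is one of the most important levers in cost engineering, because it determines how much a saved output token is worth relative to a saved input token.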

Step By Step

1. Count or estimate input and output tokens for a representative call.
   Example: a filing-summary call uses 8,000 input tokens and produces 1,200 output tokens.

2. Convert provider per-million prices to per-1K by dividing by 1000.
   Example: input 5.00 per million is 0.005 per 1K; output 25.00 per million is 0.025 per 1K.

3. Multiply each token count (in thousands) by its price.
   Example: (8000/1000) x 0.005 + (1200/1000) x 0.025 = 8 x 0.005 + 1.2 x 0.025.

4. Sum for the per-call cost, then multiply by call volume for the workload cost.
   Example: 0.040 + 0.030 = 0.070 per call; 10,000 calls per day is 700 per day.
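
The four steps can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference implementation: the helper names are hypothetical, token counts are taken as given rather than measured with a tokenizer, and the prices are the example figures from the steps, not any specific provider's rates.

```python
def per_million_to_per_1k(price_per_million: float) -> float:
    """Step 2: convert a per-million price to the per-1K rate."""
    return price_per_million / 1000


def workload_cost(input_tokens: int, output_tokens: int,
                  p_in_per_million: float, p_out_per_million: float,
                  calls_per_day: int) -> tuple[float, float]:
    """Return (per-call cost, daily cost) for a representative call."""
    p_in = per_million_to_per_1k(p_in_per_million)
    p_out = per_million_to_per_1k(p_out_per_million)
    # Step 3: token counts in thousands times the per-1K price.
    input_cost = (input_tokens / 1000) * p_in
    output_cost = (output_tokens / 1000) * p_out
    # Step 4: sum, then scale by call volume.
    per_call = input_cost + output_cost
    return per_call, per_call * calls_per_day


# Step 1: the representative call from the example above.
per_call, daily = workload_cost(8000, 1200, 5.00, 25.00, 10_000)
print(f"{per_call:.3f} per call, {daily:.2f} per day")
```

Keeping the unit conversion in its own function makes it harder to mix per-million and per-1K prices in the same calculation, which is an easy mistake when rates are copied from a pricing page.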

Worked Example

Cost of one SEC-filing summarization call at flagship pricing

Input tokens: 8,000
Output tokens: 1,200
Input / output price (per 1M): 5.00 / 25.00

Per-1K prices: P_in = 5.00/1000 = 0.005, P_out = 25.00/1000 = 0.025.
Input cost = (8000/1000) x 0.005 = 8 x 0.005 = 0.040.
Output cost = (1200/1000) x 0.025 = 1.2 x 0.025 = 0.030.
Total = 0.040 + 0.030 = 0.070.

About 0.07 per call. Note that the 1,200 output tokens cost three quarters as much (0.030) as the 8,000 input tokens (0.040), because output is priced 5x higher. Cutting the summary to 600 tokens would save 0.015 per call, three times the 0.005 saved by trimming a thousand input tokens. At 10,000 calls a day, that is the difference between 700 and 550 daily.
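
The comparison in the worked example can be checked with a short script. It is a sketch using the example's per-1K prices as constants; the two trimming scenarios (600 output tokens, 7,000 input tokens) are the ones discussed above, and the variable names are illustrative.

```python
P_IN = 0.005   # input price per 1K tokens (example value)
P_OUT = 0.025  # output price per 1K tokens (example value)


def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Per-call cost at the example's per-1K prices."""
    return (input_tokens / 1000) * P_IN + (output_tokens / 1000) * P_OUT


base = call_cost(8000, 1200)
saving_shorter_output = base - call_cost(8000, 600)   # halve the summary
saving_shorter_prompt = base - call_cost(7000, 1200)  # trim 1,000 prompt tokens

print(f"shorter output saves {saving_shorter_output:.3f} per call")  # 0.015
print(f"shorter prompt saves {saving_shorter_prompt:.3f} per call")  # 0.005
print(f"daily saving at 10,000 calls: {saving_shorter_output * 10_000:.0f}")  # 150
```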

Common Variations

Cached-input pricing: providers charge a fraction of the normal input price (often 10 to 25%) for repeated prompt prefixes. Whether caching pays off is what the prompt-cache break-even formula evaluates.

Batch pricing: asynchronous batch APIs commonly halve token prices for latency-tolerant workloads.

Per-million convention: many vendors quote per million tokens; the formula is identical with 1,000,000 in the denominator.
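
A rough sketch of how these three variations change the base formula is shown below. Everything specific here is an assumption for illustration: the parameter names are hypothetical, the cached fraction (0.10) and batch multiplier (0.5) are picked from the ranges mentioned above, any premium for writing to the cache is ignored, and the batch discount is assumed to apply to the whole call. Actual provider billing rules differ and should be checked against the published pricing.

```python
def call_cost_per_million(input_tokens: int, output_tokens: int,
                          p_in_per_million: float, p_out_per_million: float,
                          cached_input_tokens: int = 0,
                          cached_fraction: float = 0.10,
                          batch_multiplier: float = 1.0) -> float:
    """Per-call cost with per-million prices, optional cached-input
    discount, and optional batch multiplier (all illustrative)."""
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_input_tokens
    # Same formula as before, with 1,000,000 in the denominator.
    fresh_cost = (fresh_tokens / 1_000_000) * p_in_per_million
    # Cached prefix tokens are billed at a fraction of the input price.
    cached_cost = (cached_input_tokens / 1_000_000) * p_in_per_million * cached_fraction
    output_cost = (output_tokens / 1_000_000) * p_out_per_million
    return (fresh_cost + cached_cost + output_cost) * batch_multiplier


standard = call_cost_per_million(8000, 1200, 5.00, 25.00)
cached = call_cost_per_million(8000, 1200, 5.00, 25.00, cached_input_tokens=6000)
batched = call_cost_per_million(8000, 1200, 5.00, 25.00, batch_multiplier=0.5)
print(f"standard {standard:.3f}, cached prefix {cached:.3f}, batch {batched:.3f}")
```

With these assumed values, caching 6,000 of the 8,000 input tokens lowers only the input side of the bill, while the batch multiplier scales input and output together, so the two levers affect different workloads differently.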
