
Cost Per 1K Tokens Formula

The cost of an LLM call is the input token count times the input price plus the output token count times the output price, with prices quoted per thousand (or per million) tokens. Because output tokens are typically priced several times higher than input tokens, controlling generation length usually matters more for the bill than trimming the prompt.

By AI Fin Hub Research · AI Fin Hub Team


Formula

Copy the exact expression or work through it step by step below.

Cost = (InputTokens / 1000) x P_in + (OutputTokens / 1000) x P_out

where P_in and P_out are the input and output prices per 1K tokens.
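As a minimal sketch, the expression above translates directly into a small Python function. The function and parameter names here are illustrative choices, not part of any provider SDK, and the numbers in the usage line are example values only.

```python
def call_cost(input_tokens: int, output_tokens: int,
              p_in_per_1k: float, p_out_per_1k: float) -> float:
    """Cost of one call, with both prices quoted per 1K tokens."""
    input_cost = (input_tokens / 1000) * p_in_per_1k
    output_cost = (output_tokens / 1000) * p_out_per_1k
    return input_cost + output_cost


# Illustrative values: 8,000 input tokens, 1,200 output tokens,
# 0.005 and 0.025 per 1K tokens for input and output.
example = call_cost(8_000, 1_200, p_in_per_1k=0.005, p_out_per_1k=0.025)
print(f"{example:.3f}")  # prints 0.070
```

The result is in whatever currency the prices are quoted in; the function does no rounding, so format the value at display time.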

Variables

InputTokens

Input (prompt) tokens

Tokens sent to the model: system prompt, context, documents, and the user message. For finance workloads (filings, transcripts) this is usually the larger of the two counts, but it is billed at the cheaper per-token rate.

OutputTokens

Output (completion) tokens

Tokens the model generates. Most providers price output higher than input, often by a factor of 3 to 5, so a verbose response can dominate the cost even when the prompt is large.

P_in

Input price per 1K tokens

Provider's published input rate. Quoted per million tokens by most vendors; divide by 1000 to get the per-1K rate used here.

P_out

Output price per 1K tokens

Provider's published output rate, applied to generated tokens. Because of the gap between input and output prices, output length is often the most effective lever in cost engineering.
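One way to see how the price gap interacts with the two token counts is to compute the output length at which output cost equals input cost, and the share of a call's cost that comes from output. The sketch below derives both from the formula; the helper names are hypothetical and the values are illustrative.

```python
def break_even_output_tokens(input_tokens: float, p_in: float, p_out: float) -> float:
    """Output-token count at which output cost equals input cost.

    Setting (out / 1000) * p_out == (in / 1000) * p_in
    gives out = in * p_in / p_out.
    """
    if p_out <= 0:
        raise ValueError("p_out must be positive")
    return input_tokens * p_in / p_out


def output_share(input_tokens: float, output_tokens: float,
                 p_in: float, p_out: float) -> float:
    """Fraction of the call cost attributable to output tokens."""
    input_cost = (input_tokens / 1000) * p_in
    output_cost = (output_tokens / 1000) * p_out
    total = input_cost + output_cost
    return output_cost / total if total > 0 else 0.0


# Illustrative values: 8,000 input tokens, output priced 5x input.
print(break_even_output_tokens(8_000, p_in=0.005, p_out=0.025))
print(f"{output_share(8_000, 1_200, 0.005, 0.025):.1%}")
```

With a 5x price ratio, an output one fifth the length of the prompt already costs as much as the prompt itself.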

Step By Step

  1. Count or estimate input and output tokens for a representative call.

     A filing-summary call uses 8,000 input tokens and produces 1,200 output tokens.

  2. Convert provider per-million prices to per-1K by dividing by 1000.

     Input 5.00 per million is 0.005 per 1K; output 25.00 per million is 0.025 per 1K.

  3. Multiply each token count (in thousands) by its per-1K price.

     Input: (8000/1000) x 0.005 = 8 x 0.005 = 0.040. Output: (1200/1000) x 0.025 = 1.2 x 0.025 = 0.030.

  4. Add the two for the per-call cost, then multiply by call volume for the workload cost.

     0.040 + 0.030 = 0.070 per call; 10,000 calls per day is 700 per day.
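The four steps can be sketched as one function that starts from per-million prices, as most vendors publish them, and returns both the per-call and the daily cost. The names `per_1k` and `workload_cost` are assumptions made for this illustration.

```python
def per_1k(price_per_million: float) -> float:
    """Step 2: convert a per-million-token price to a per-1K price."""
    return price_per_million / 1000


def workload_cost(input_tokens: int, output_tokens: int,
                  in_price_per_million: float, out_price_per_million: float,
                  calls_per_day: int) -> tuple[float, float]:
    """Return (per_call_cost, daily_cost) for a representative call."""
    p_in = per_1k(in_price_per_million)
    p_out = per_1k(out_price_per_million)
    # Step 3: token counts in thousands times the per-1K prices.
    input_cost = (input_tokens / 1000) * p_in
    output_cost = (output_tokens / 1000) * p_out
    # Step 4: sum for one call, then scale by call volume.
    per_call = input_cost + output_cost
    return per_call, per_call * calls_per_day


# Step 1: counts for a representative call (illustrative values).
per_call, daily = workload_cost(8_000, 1_200, 5.00, 25.00, calls_per_day=10_000)
print(f"per call: {per_call:.3f}, per day: {daily:.0f}")
```

This assumes every call has the same token counts; for a real workload, averaging over a sample of logged calls would give a better representative call.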

Worked Example

Cost of one SEC-filing summarization call at flagship pricing

Input tokens: 8,000
Output tokens: 1,200
Input / output price (per 1M tokens): 5.00 / 25.00

Per-1K prices: P_in = 5.00/1000 = 0.005; P_out = 25.00/1000 = 0.025.
Input cost = (8000/1000) x 0.005 = 8 x 0.005 = 0.040.
Output cost = (1200/1000) x 0.025 = 1.2 x 0.025 = 0.030.
Total = 0.040 + 0.030 = 0.070.

About 0.07 per call. The 1,200 output tokens cost three-quarters as much (0.030) as the 8,000 input tokens (0.040), because output is priced 5x higher. Cutting the summary to 600 tokens would save 0.015 per call, three times the 0.005 saved by trimming a thousand input tokens. At 10,000 calls a day, halving the summary lowers the daily cost from 700 to 550.
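The comparison between trimming output and trimming input can be checked with a short calculation. This is a sketch using the worked example's prices; `saving_per_call` is a hypothetical helper name.

```python
def saving_per_call(tokens_removed: int, price_per_1k: float) -> float:
    """Cost saved on one call by removing tokens billed at the given per-1K price."""
    return (tokens_removed / 1000) * price_per_1k


P_IN, P_OUT = 0.005, 0.025   # per-1K prices from the worked example
CALLS_PER_DAY = 10_000

output_saving = saving_per_call(600, P_OUT)    # halve the 1,200-token summary
input_saving = saving_per_call(1_000, P_IN)    # trim 1,000 prompt tokens

print(f"output trim: {output_saving:.3f} per call, "
      f"{output_saving * CALLS_PER_DAY:.0f} per day")
print(f"input trim:  {input_saving:.3f} per call, "
      f"{input_saving * CALLS_PER_DAY:.0f} per day")
```

Removing 600 output tokens saves three times as much as removing 1,000 input tokens here, which follows directly from the 5x price ratio (600 x 5 = 3,000 input-token equivalents).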

Common Variations

Cached-input pricing: providers charge a fraction of the standard input price (often 10 to 25%) for repeated prompt prefixes; the prompt-cache break-even formula evaluates this case.
Batch pricing: asynchronous batch APIs commonly halve token prices for latency-tolerant workloads.
Per-million convention: many vendors quote per million tokens; the formula is identical with 1,000,000 in the denominator.
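The three variations above can be folded into one function as a rough sketch. Everything beyond the base formula is an assumption for illustration: the parameter names are hypothetical, the cached fraction and batch discount in the usage lines are example values rather than any vendor's published rates, and the sketch assumes a batch discount applies to the whole call, which may not match a given provider's billing rules.

```python
def call_cost_with_variations(
    input_tokens: int,
    output_tokens: int,
    p_in: float,
    p_out: float,
    tokens_per_price_unit: int = 1000,    # 1_000_000 for per-million prices
    cached_tokens: int = 0,               # input tokens billed at the cached rate
    cached_price_fraction: float = 1.0,   # assumed share of p_in charged for cached tokens
    batch_discount: float = 0.0,          # assumed discount on the whole call, e.g. 0.5
) -> float:
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    # Cached tokens count as a fraction of a full-price input token.
    billable_input = fresh_tokens + cached_tokens * cached_price_fraction
    input_cost = billable_input / tokens_per_price_unit * p_in
    output_cost = output_tokens / tokens_per_price_unit * p_out
    return (input_cost + output_cost) * (1.0 - batch_discount)


base = call_cost_with_variations(8_000, 1_200, 0.005, 0.025)
per_million = call_cost_with_variations(8_000, 1_200, 5.00, 25.00,
                                        tokens_per_price_unit=1_000_000)
cached = call_cost_with_variations(8_000, 1_200, 0.005, 0.025,
                                   cached_tokens=6_000, cached_price_fraction=0.10)
batch = call_cost_with_variations(8_000, 1_200, 0.005, 0.025, batch_discount=0.5)
print(f"{base:.3f} {per_million:.3f} {cached:.3f} {batch:.3f}")
```

The per-1K and per-million calls return the same cost, which is the point of the per-million convention: only the denominator changes.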


Planning estimates only — not financial, tax, or investment advice.