LLM API Cost Calculator
Large language model APIs are priced per token, split into cheaper input tokens and more expensive output tokens. Enter the tokens per request, how many requests you run, and the per-million prices, and this free calculator returns the cost per request and your projected monthly and yearly spend — so you can budget an AI feature before you ship it.
How LLM API pricing works
Most large language model APIs bill by the token, which is roughly ¾ of a word in English. Each request has two parts that are priced differently: input tokens (your prompt, system instructions and any context you send in) and output tokens (what the model writes back). Output is usually several times more expensive than input per token, so a long response costs far more than a prompt of the same length.
Prices are quoted per million tokens. To get the cost of one request you scale those per-million prices down to the actual tokens used.
LLM cost formula
Cost per request = (input tokens ÷ 1,000,000 × input price) + (output tokens ÷ 1,000,000 × output price)
Monthly cost = cost per request × requests per month
Annual cost = monthly cost × 12
The two token counts, the two prices and your request volume are all you need. Everything else (caching discounts, batch pricing, free tiers) adjusts these base numbers.
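As a minimal sketch, the three formulas above translate directly into Python. The function and parameter names are our own choices for illustration, not part of any provider's SDK, and prices are assumed to be in dollars per million tokens.

```python
def cost_per_request(input_tokens, output_tokens, input_price, output_price):
    """Dollar cost of one request; prices are per million tokens."""
    input_cost = input_tokens / 1_000_000 * input_price
    output_cost = output_tokens / 1_000_000 * output_price
    return input_cost + output_cost


def monthly_cost(per_request, requests_per_month):
    """Projected monthly spend at a steady request volume."""
    return per_request * requests_per_month


def annual_cost(monthly):
    """Projected yearly spend, assuming every month looks the same."""
    return monthly * 12
```

Floating-point arithmetic is fine for budgeting; for invoicing-grade precision you would likely switch to `decimal.Decimal`.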
A worked example
A support assistant sends 1,000 input tokens and gets 500 output tokens per request, at $1 per million input and $5 per million output, running 100,000 times a month.
- Input cost: 1,000 ÷ 1M × $1 = $0.001
- Output cost: 500 ÷ 1M × $5 = $0.0025
- Cost per request: $0.0035
- Monthly: $0.0035 × 100,000 = $350
- Annual: $4,200
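If you prefer to check the arithmetic yourself, the same steps can be reproduced in a few lines of Python. This is only a sketch that mirrors the bullets above; the variable names are ours.

```python
# Figures taken from the support-assistant example above.
input_tokens = 1_000
output_tokens = 500
input_price = 1.00     # dollars per million input tokens
output_price = 5.00    # dollars per million output tokens
requests_per_month = 100_000

input_cost = input_tokens / 1_000_000 * input_price
output_cost = output_tokens / 1_000_000 * output_price
per_request = input_cost + output_cost
monthly = per_request * requests_per_month
annual = monthly * 12

print(f"Input cost:  ${input_cost:.4f}")    # $0.0010
print(f"Output cost: ${output_cost:.4f}")   # $0.0025
print(f"Per request: ${per_request:.4f}")   # $0.0035
print(f"Monthly:     ${monthly:,.2f}")      # $350.00
print(f"Annual:      ${annual:,.2f}")       # $4,200.00
```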
How to reduce LLM API costs
Because output tokens cost more per token and often dominate the bill, the biggest savings usually come from asking for shorter, more structured responses. Other reliable levers: trim the context you send with each request, use a smaller, cheaper model for simple steps and reserve the expensive model for hard ones, and use prompt caching where the same long prefix repeats across requests, since cached input is billed at a fraction of the normal rate. Batching non-urgent work often unlocks a further discount.
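To see how two of these levers compare, here is a rough sketch that reuses the prices and volume from the worked example. The reduced token counts (250 output tokens, 600 input tokens) are hypothetical placeholders, not measurements; substitute your own.

```python
def monthly_cost(input_tokens, output_tokens, input_price, output_price, requests):
    """Monthly spend in dollars; prices are per million tokens."""
    per_request = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return per_request * requests


# Prices and volume from the worked example above.
PRICING = dict(input_price=1.00, output_price=5.00, requests=100_000)

# (input tokens, output tokens) per request; the last two rows are hypothetical.
scenarios = {
    "baseline": (1_000, 500),
    "shorter responses": (1_000, 250),
    "trimmed context": (600, 500),
}

baseline = monthly_cost(*scenarios["baseline"], **PRICING)
for name, (inp, out) in scenarios.items():
    cost = monthly_cost(inp, out, **PRICING)
    print(f"{name:<18} ${cost:8.2f}  saves ${baseline - cost:7.2f}")
```

On these placeholder numbers, halving the response length saves about $125 a month, while cutting 40% of the context saves about $40.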
Why estimate cost before building
An AI feature that looks cheap per request can become expensive at scale — $0.0035 is nothing until you multiply it by millions of calls. Running the numbers up front tells you whether a feature is viable, which model tier fits the budget, and where to cap usage. It is the same discipline behind any automation decision: know the unit economics before you commit.
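One way to decide where to cap usage is to turn the formula around and ask how many requests a fixed budget can cover. The sketch below assumes a flat monthly budget and steady token counts per request; the function name `max_requests` is ours, and `Decimal` is used only to avoid floating-point rounding at the boundary.

```python
from decimal import Decimal


def max_requests(budget, input_tokens, output_tokens, input_price, output_price):
    """Largest whole number of requests that fits within the budget.

    Prices are in dollars per million tokens; budget is in dollars.
    """
    per_request = (
        Decimal(input_tokens) * Decimal(str(input_price))
        + Decimal(output_tokens) * Decimal(str(output_price))
    ) / Decimal(1_000_000)
    if per_request <= 0:
        raise ValueError("cost per request must be positive")
    # int() truncates, so a partial request is never counted.
    return int(Decimal(str(budget)) / per_request)


# A hypothetical $1,000 monthly budget at the worked-example rates.
print(max_requests(1_000, 1_000, 500, 1.00, 5.00))  # 285714
```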
Frequently asked questions
Is the LLM API cost calculator free?
Yes, free and with no sign-up. The calculation runs in your browser and nothing you enter is stored or sent anywhere.
What is a token?
A token is a chunk of text the model processes — about ¾ of a word in English, so 1,000 tokens is roughly 750 words. Both your prompt and the model’s reply are measured in tokens.
Why are output tokens more expensive than input?
Generating text is more compute-intensive than reading it, so providers price output tokens higher, often three to five times the input price. That is why a long response costs more than a prompt of the same length.
How do I find the token counts for my use case?
Estimate from word count (words ÷ 0.75 ≈ tokens) or use your provider’s token-counting tool on a few representative prompts and responses, then average them.
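The word-count rule of thumb can be sketched as follows. This is an approximation for English text only; real tokenizers differ by model, so treat the result as a starting point. The helper names are hypothetical.

```python
import math
from statistics import mean

WORDS_PER_TOKEN = 0.75  # rule of thumb for English text


def estimate_tokens(text):
    """Approximate token count from whitespace-separated words."""
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)


def average_tokens(samples):
    """Average the estimate over a few representative prompts or responses."""
    return round(mean([estimate_tokens(s) for s in samples]))
```

For example, a 750-word prompt comes out at an estimated 1,000 tokens.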
Does this include caching or batch discounts?
No. It uses the standard per-million prices you enter. If you use prompt caching, lower the input price to reflect the cached rate; if you use batch pricing, enter the discounted prices instead, or run a second estimate and compare.
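To work out what input price to enter when part of your prompt is cached, you can blend the two rates. The sketch below is an assumption-laden simplification: `cached_share` and `cached_price_ratio` are hypothetical inputs you would take from your own traffic and your provider's price list, and any fee for writing to the cache is ignored.

```python
def effective_input_price(input_price, cached_share, cached_price_ratio):
    """Blend normal and cached input prices into one per-million rate.

    cached_share: fraction of input tokens read from cache (0 to 1).
    cached_price_ratio: cached price as a fraction of the normal price.
    """
    if not 0 <= cached_share <= 1:
        raise ValueError("cached_share must be between 0 and 1")
    uncached = (1 - cached_share) * input_price
    cached = cached_share * input_price * cached_price_ratio
    return uncached + cached


# Placeholder figures: $1 per million input tokens, 80% of input tokens
# cached, cached tokens billed at 10% of the normal rate.
blended = effective_input_price(1.00, cached_share=0.8, cached_price_ratio=0.1)
print(f"Effective input price: ${blended:.2f} per million tokens")  # $0.28
```

Enter the blended figure as the input price in the calculator to get a caching-adjusted estimate.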