LLM API Pricing (July 2026): GPT-5.4 $2.50/M · Claude Sonnet 5 $2/M · Full Table
Compare 23 verified LLM API prices across 7 providers. LFM2 24B A2B on Together has the lowest input price at $0.03/M tokens. The table covers GPT, Claude, Gemini, DeepSeek, Groq, xAI, and Together AI models, with official source links.
Verified pricing dataset
LLM API prices from tracked source data
Prices are shown only after source verification. Pending providers are tracked but excluded from rankings and calculators.
- As of: July 2026
- Update cadence: Checked daily; published only after source verification.
- Source policy: Official provider pricing pages or APIs only.
- Machine-readable data: /api/pricing.json
- Tracked sources:
  - https://api-docs.deepseek.com/quick_start/pricing
  - https://platform.openai.com/docs/pricing
  - https://docs.anthropic.com/en/docs/about-claude/pricing
  - https://ai.google.dev/gemini-api/docs/pricing
  - https://docs.x.ai/overview
  - https://www.together.ai/pricing
  - https://groq.com/pricing/
  - https://docs.mistral.ai/getting-started/models/models_overview/
  - https://fireworks.ai/pricing
  - https://openrouter.ai/api/v1/models
Quick pricing answers
Cheapest input
LFM2 24B A2B on Together
$0.03/M input · $0.12/M output
Cheapest output
Llama 3.1 8B Instant on Groq
$0.08/M output
Lowest first-party lab price
Gemini 2.5 Flash-Lite
Google · $0.10/M input
Best cached-input price
DeepSeek V4 Flash
$0.0028/M cached input
These answers are generated from 23 verified models across 7 providers as of July 2026. Use the calculator for workload-specific totals because output tokens and cache hit rate can change the cheapest choice.
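The effect of output volume and cache hit rate can be sketched with a simple linear cost model. This is an illustration, not TLDL's calculator: the function names and the three workloads are hypothetical, the prices are copied from the verified table, and models with no listed cached-input price are assumed to bill cached tokens at the full input rate.

```python
def workload_cost(input_price, cached_price, output_price,
                  input_tokens, output_tokens, cache_hit_rate=0.0):
    """Return the USD cost of a workload; prices are USD per 1M tokens."""
    if cached_price is None:
        # Assumption: no cached price listed means no cache discount.
        cached_price = input_price
    cached = input_tokens * cache_hit_rate
    uncached = input_tokens - cached
    total = (uncached * input_price
             + cached * cached_price
             + output_tokens * output_price)
    return total / 1_000_000


# (input, cached input, output) in USD per 1M tokens, from the verified table.
PRICES = {
    "LFM2 24B A2B on Together": (0.03, None, 0.12),
    "Llama 3.1 8B Instant on Groq": (0.05, None, 0.08),
    "DeepSeek V4 Flash": (0.14, 0.0028, 0.28),
}


def cheapest(input_tokens, output_tokens, cache_hit_rate):
    """Return the name of the cheapest model in PRICES for this workload."""
    costs = {
        name: workload_cost(*prices, input_tokens, output_tokens, cache_hit_rate)
        for name, prices in PRICES.items()
    }
    return min(costs, key=costs.get)


print(cheapest(10_000_000, 1_000_000, 0.0))    # input-heavy, no caching
print(cheapest(1_000_000, 10_000_000, 0.0))    # output-heavy, no caching
print(cheapest(100_000_000, 1_000_000, 0.95))  # input-heavy, 95% cache hits
```

Under these assumed workloads, each of the three models comes out cheapest once, which is why a single "cheapest model" answer is not enough.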
| Provider | Model | Input / 1M | Cached input / 1M | Output / 1M | Context | Verified |
|---|---|---|---|---|---|---|
| DeepSeek | DeepSeek V4 Flash | $0.14 | $0.0028 | $0.28 | 1,000,000 | July 2026 |
| DeepSeek | DeepSeek V4 Pro | $0.435 | $0.0036 | $0.87 | 1,000,000 | July 2026 |
| OpenAI | GPT-5.5 | $5.00 | $0.50 | $30.00 | n/a | July 2026 |
| OpenAI | GPT-5.4 | $2.50 | $0.25 | $15.00 | n/a | July 2026 |
| OpenAI | GPT-5.4 mini | $0.75 | $0.075 | $4.50 | n/a | July 2026 |
| OpenAI | GPT-5.4 nano | $0.20 | $0.02 | $1.25 | n/a | July 2026 |
| Anthropic | Claude Opus 4.8 | $5.00 | $0.50 | $25.00 | n/a | July 2026 |
| Anthropic | Claude Sonnet 5 | $2.00 | $0.20 | $10.00 | n/a | July 2026 |
| Anthropic | Claude Sonnet 4.6 | $3.00 | $0.30 | $15.00 | n/a | July 2026 |
| Anthropic | Claude Haiku 4.5 | $1.00 | $0.10 | $5.00 | n/a | July 2026 |
| Google | Gemini 3 Flash Preview | $0.50 | $0.05 | $3.00 | n/a | July 2026 |
| Google | Gemini 2.5 Pro | $1.25 | $0.125 | $10.00 | 1,000,000 | July 2026 |
| Google | Gemini 2.5 Flash | $0.30 | $0.03 | $2.50 | 1,000,000 | July 2026 |
| Google | Gemini 2.5 Flash-Lite | $0.10 | $0.01 | $0.40 | n/a | July 2026 |
| xAI | Grok Build 0.1 | $1.00 | n/a | $2.00 | 256,000 | July 2026 |
| Together AI | DeepSeek V4 Pro on Together | $1.74 | $0.20 | $3.48 | n/a | July 2026 |
| Together AI | MiniMax M3 on Together | $0.30 | $0.06 | $1.20 | n/a | July 2026 |
| Together AI | gpt-oss-120B on Together | $0.15 | n/a | $0.60 | n/a | July 2026 |
| Together AI | LFM2 24B A2B on Together | $0.03 | n/a | $0.12 | n/a | July 2026 |
| Groq | GPT OSS 20B on Groq | $0.075 | $0.038 | $0.30 | 128,000 | July 2026 |
| Groq | GPT OSS 120B on Groq | $0.15 | $0.075 | $0.60 | 128,000 | July 2026 |
| Groq | Llama 4 Scout on Groq | $0.11 | n/a | $0.34 | 128,000 | July 2026 |
| Groq | Llama 3.1 8B Instant on Groq | $0.05 | n/a | $0.08 | 128,000 | July 2026 |
Pending source verification
Mistral
Official model docs are tracked, but a stable per-model token pricing table was not found in the fetched docs. Do not publish Mistral prices until verified from an official pricing table.
https://docs.mistral.ai/getting-started/models/models_overview/
Fireworks
Official pricing page exposes serverless, fine-tuning, and GPU-hour pricing, but the fetched content points per-token estimates to a separate blog. Keep token pricing pending until a stable official per-model token table is available.
https://fireworks.ai/pricing
OpenRouter
Provider exposes model pricing through its models API; ingestion needs a deterministic model-selection policy before publishing aggregate OpenRouter prices.
https://openrouter.ai/api/v1/models
Pricing questions
What is the cheapest LLM API in July 2026?
LFM2 24B A2B on Together has the lowest verified input-token price at $0.03/M input tokens.
Which model is cheapest for generation-heavy workloads?
Llama 3.1 8B Instant on Groq has the lowest verified output-token price at $0.08/M output tokens.
Why are some providers still pending?
TLDL publishes prices only when the official source exposes stable per-model token pricing. Pending rows keep tracked sources visible without mixing guessed prices into the table.
Pricing data changelog
2026-07-04 · DeepSeek
Verified DeepSeek V4 Flash and V4 Pro pricing from the official DeepSeek API pricing page.
https://api-docs.deepseek.com/quick_start/pricing
2026-07-04 · OpenAI
Verified GPT-5.5 and GPT-5.4 family standard API pricing from the official OpenAI pricing page.
https://platform.openai.com/docs/pricing
2026-07-04 · Anthropic
Verified current Claude first-party API pricing from the official Anthropic pricing page.
https://docs.anthropic.com/en/docs/about-claude/pricing
2026-07-04 · Google
Verified Gemini 3 Flash and Gemini 2.5 family standard text pricing from the official Gemini API pricing page.
https://ai.google.dev/gemini-api/docs/pricing
2026-07-04 · Groq
Verified Groq on-demand LLM token pricing from the official Groq pricing page.
https://groq.com/pricing/
2026-07-04 · Together AI
Verified selected Together AI serverless inference prices from the official Together AI pricing page.
https://www.together.ai/pricing
2026-07-04 · xAI
Verified Grok Build 0.1 token pricing from the official xAI docs overview.
https://docs.x.ai/overview
2026-07-04 · TLDL
Kept Mistral, Fireworks token inference, and OpenRouter aggregate pricing as source-pending where the official source is not a stable per-model token table in this dataset yet.
https://www.tldl.io/api/pricing.json
This page ranks only models with verified pricing in TLDL's shared dataset. The table above is generated from the same source used by the public pricing API, the LLM API pricing comparison, and the LLM cost calculator.
Pending providers are tracked but excluded from cheapest-model claims until their official source data is verified. That rule prevents stale copied numbers from being ranked as if they were current.
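The exclusion rule can be sketched as a filter applied before any ranking. This is a minimal illustration: the record layout and the `status` field are hypothetical and are not taken from the published dataset. Pending entries carry no price at all, so there is nothing stale to rank.

```python
# Hypothetical records: the "status" field and layout are assumptions.
ENTRIES = [
    {"provider": "Together AI", "model": "LFM2 24B A2B on Together",
     "input": 0.03, "status": "verified"},
    {"provider": "Groq", "model": "Llama 3.1 8B Instant on Groq",
     "input": 0.05, "status": "verified"},
    {"provider": "Mistral", "model": None, "input": None, "status": "pending"},
    {"provider": "Fireworks", "model": None, "input": None, "status": "pending"},
]


def rankable(entries):
    """Keep only entries whose pricing has been verified."""
    return [e for e in entries if e["status"] == "verified"]


def cheapest_input(entries):
    """Return the verified entry with the lowest input price."""
    return min(rankable(entries), key=lambda e: e["input"])


print(cheapest_input(ENTRIES)["model"])
```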
Current verified budget options
The verified table above is the source of truth. Sort by input price for routing, extraction, classification, and batch-analysis workloads. Sort by output price for summarization, chat, long-form generation, and agent workflows where responses can become large.
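As a small illustration of how the two sort orders differ, the sketch below ranks five rows copied from the verified table. The `rank` helper and the dictionary layout are hypothetical.

```python
from operator import itemgetter

# Input and output prices in USD per 1M tokens, copied from the verified table.
ROWS = [
    {"model": "LFM2 24B A2B on Together", "input": 0.03, "output": 0.12},
    {"model": "Llama 3.1 8B Instant on Groq", "input": 0.05, "output": 0.08},
    {"model": "GPT OSS 20B on Groq", "input": 0.075, "output": 0.30},
    {"model": "Gemini 2.5 Flash-Lite", "input": 0.10, "output": 0.40},
    {"model": "DeepSeek V4 Flash", "input": 0.14, "output": 0.28},
]


def rank(rows, key):
    """Return model names ordered from cheapest to most expensive by `key`."""
    return [row["model"] for row in sorted(rows, key=itemgetter(key))]


by_input = rank(ROWS, "input")    # for routing, extraction, classification
by_output = rank(ROWS, "output")  # for summarization, chat, long generation

print(by_input[0])   # LFM2 24B A2B on Together
print(by_output[0])  # Llama 3.1 8B Instant on Groq
```

DeepSeek V4 Flash is last of the five by input price but third by output price, so the ordering depends on which side of the bill dominates.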
When cheap is not actually cheap
Low token prices can be offset by retries, weaker tool calling, latency, lower success rates, or longer generated outputs. Test cost per completed task, not just cost per token.
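One way to put a number on cost per completed task is to divide the cost of one attempt by the success rate. That assumes each attempt succeeds independently and failed attempts are retried until one succeeds, so the expected number of attempts is 1 / success rate. In the sketch below the prices come from the verified table; the token counts and both success rates are invented for illustration and are not measurements of either model.

```python
def call_cost(input_price, output_price, input_tokens, output_tokens):
    """USD cost of one call; prices are USD per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cost_per_completed_task(cost_per_attempt, success_rate):
    """Expected USD cost per success when failures are retried until success."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


# Hypothetical workload: 2,000 input tokens and 500 output tokens per attempt.
small_attempt = call_cost(0.05, 0.08, 2_000, 500)  # Llama 3.1 8B Instant on Groq
large_attempt = call_cost(0.11, 0.34, 2_000, 500)  # Llama 4 Scout on Groq

# Hypothetical success rates, chosen only to show the effect.
small_task = cost_per_completed_task(small_attempt, 0.30)
large_task = cost_per_completed_task(large_attempt, 0.90)

print(f"per attempt: {small_attempt:.6f} vs {large_attempt:.6f}")
print(f"per completed task: {small_task:.6f} vs {large_task:.6f}")
```

With these assumed rates the model that is cheaper per attempt ends up more expensive per completed task.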
How to compare models
Use the LLM cost calculator with your expected input tokens, output tokens, and cache behavior. A small model can be cheaper for simple routing and more expensive for hard tasks if it needs repeated calls.
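The repeated-calls point can be expressed as a break-even count: how many calls to the smaller model cost the same as one call to the larger one. This sketch uses GPT-5.4 nano and GPT-5.4 mini prices from the verified table. The token counts and helper names are hypothetical, and every call is assumed to use the same number of tokens.

```python
import math


def call_cost(input_price, output_price, input_tokens, output_tokens):
    """USD cost of one call; prices are USD per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def break_even_calls(small_call_cost, large_call_cost):
    """Number of small-model calls whose total cost equals one large-model call."""
    return large_call_cost / small_call_cost


# Hypothetical request shape: 3,000 input tokens and 400 output tokens.
nano = call_cost(0.20, 1.25, 3_000, 400)  # GPT-5.4 nano
mini = call_cost(0.75, 4.50, 3_000, 400)  # GPT-5.4 mini

ratio = break_even_calls(nano, mini)
max_cheaper_calls = math.floor(ratio)

print(f"break-even at {ratio:.2f} calls")
print(f"nano costs no more than mini for up to {max_cheaper_calls} calls per task")
```

For this request shape, a task that needs four or more nano calls costs more than a single mini call.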
Published on TLDL. Follow the newsletter RSS feed for lightweight updates.