LLM API Pricing Comparison 2026
Compare LLM API costs across major providers. Updated February 2026 with official pricing.
Official Provider Pricing
OpenAI (Feb 2026)
Source: openai.com/api/pricing
| Model |
Input |
Output |
Context |
Notes |
| GPT-5.2 |
$1.75/M |
$14/M |
200K |
Flagship coding/agent |
| GPT-5.2 pro |
$21/M |
$168/M |
200K |
Most capable |
| GPT-5 mini |
$0.25/M |
$2/M |
200K |
Fast, cheap |
| GPT-4.1 |
$3/M |
$12/M |
128K |
Reasoning |
| GPT-4.1 mini |
$0.80/M |
$3.20/M |
128K |
Budget |
| GPT-4.1 nano |
$0.20/M |
$0.80/M |
128K |
Fastest |
Anthropic (Feb 2026)
Source: claude.com/pricing
| Model |
Input |
Output |
Context |
Notes |
| Claude Opus 4.6 |
$5/M |
$25/M |
200K |
Most capable |
| Claude Sonnet 4.6 |
$3/M |
$15/M |
200K |
Best value |
| Claude Haiku 4.5 |
$1/M |
$5/M |
200K |
Fastest |
Google Gemini (Feb 2026)
Source: cloud.google.com/vertex-ai/pricing
| Model |
Input |
Output |
Context |
Notes |
| Gemini 2.5 Pro |
$1.25/M |
$10/M |
2M |
Best price/perf |
| Gemini 2.0 Flash |
$0.10/M |
$0.40/M |
1M |
Ultra cheap |
| Gemini 1.5 Pro |
$1.25/M |
$10/M |
2M |
Reliable |
DeepSeek
| Model |
Input |
Output |
Context |
Notes |
| DeepSeek V3 |
$0.27/M |
$1.10/M |
64K |
Open weights |
| DeepSeek R1 |
$0.27/M |
$1.10/M |
64K |
Reasoning |
Meta Llama (Open Weights)
| Model |
Input |
Output |
Context |
Notes |
| Llama 4 |
Free |
Free |
200K |
Open source |
| Llama 3.3 |
Free |
Free |
128K |
Open source |
Price Ranking (Standard Models)
| Rank |
Model |
Input |
Output |
$/1M (avg) |
| 1 |
Gemini 2.0 Flash |
$0.10 |
$0.40 |
$0.25 |
| 2 |
GPT-4.1 nano |
$0.20 |
$0.80 |
$0.50 |
| 3 |
Claude Haiku 4.5 |
$1.00 |
$5.00 |
$3.00 |
| 4 |
Gemini 2.5 Pro |
$1.25 |
$10.00 |
$5.63 |
| 5 |
GPT-5 mini |
$0.25 |
$2.00 |
$1.13 |
| 6 |
Claude Sonnet 4.6 |
$3.00 |
$15.00 |
$9.00 |
| 7 |
GPT-4.1 |
$3.00 |
$12.00 |
$7.50 |
| 8 |
Claude Opus 4.6 |
$5.00 |
$25.00 |
$15.00 |
| 9 |
GPT-5.2 |
$1.75 |
$14.00 |
$7.88 |
Cost Optimization
Prompt Caching (Major Savings)
| Provider |
Savings |
| OpenAI |
90% off cached |
| Anthropic |
90% off cached |
| Google |
75% off cached |
Batch API (50% Savings)
- OpenAI Batch API: 50% off
- Run async over 24 hours
Tips
- Use smaller models for simple tasks
- Cache prompts aggressively
- Use streaming to reduce perceived latency
Cost Calculator Resources
Recommendations by Use Case
| Use Case |
Model |
Est. Cost/Month |
| Chat app (10K users) |
Claude Haiku |
$30-100 |
| Code assistant |
GPT-5.2 |
$200-500 |
| High volume |
Gemini Flash |
$5-20 |
| Research |
Claude Opus |
$500-2000 |
| Budget app |
Gemini 2.0 Flash |
$5-10 |
Prices from official sources as of February 2026. Verify at provider websites before building.