LLM Rank.top


LLM API pricing — every major model, cheapest first

Per-million-token prices for 30 commercial LLMs — input, output, output-to-input ratio, context window, and a one-click route to try each behind a single key. Grouped by tier so you can match price to your workload at a glance.

Don't juggle 30 provider keys to compare prices in production.

OpenRouter routes one API key across every model on this page: pay-as-you-go, no monthly minimum, real-time per-token billing. Switch models without code changes. (Affiliate link; supports this site.)

Blended cost = $/1M input + 4× $/1M output. Output usually dominates real bills, so the blended number is the best single proxy for "what this model will cost you per request" before you've measured your actual mix. For an exact answer, use the API cost calculator.
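As a rough illustration, the blended formula and the tier cut-offs used in the headings on this page can be sketched in a few lines of Python. The function names and the handling of values that land exactly on a tier boundary are assumptions made for this sketch; the prices are copied from the tables on this page.

```python
def blended_cost(input_per_m: float, output_per_m: float) -> float:
    """Blended $ figure as defined on this page: input + 4 x output."""
    return input_per_m + 4 * output_per_m


def tier(blended: float) -> str:
    """Map a blended cost to this page's tier names.

    Assumption: a value exactly on $5 or $30 goes to the higher tier,
    and exactly $100 stays in "frontier". The page does not specify this.
    """
    if blended < 5:
        return "cheap"
    if blended < 30:
        return "mid"
    if blended <= 100:
        return "frontier"
    return "premium"


# (input, output) prices per 1M tokens, copied from the tables on this page.
PRICES = {
    "Phi-4": (0.07, 0.14),
    "GPT-5 mini": (0.25, 2.00),
    "o3": (2.00, 8.00),
    "Claude Opus 4.1": (15.00, 75.00),
}

for name, (inp, out) in PRICES.items():
    cost = blended_cost(inp, out)
    print(f"{name}: ${cost:.2f} blended -> {tier(cost)}")
```

With these four rows the sketch gives $0.63, $8.25, $34.00, and $315.00, one model per tier.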

Cheap / fast tier — under $5 blended

Production-volume workhorses. The cheapest mainstream APIs on the market — pay-as-you-go and well under $1 per million input tokens.

| Model | Provider | Category | Score | $ in / 1M | $ out / 1M | Out/in | Context |
|---|---|---|---|---|---|---|---|
| Phi-4 | Microsoft | open-weights | 71.2 | $0.07 | $0.14 | 2.0× | 16,384 |
| Qwen2.5-Coder 32B | Alibaba | open-weights | 68.8 | $0.18 | $0.18 | 1.0× | 131,072 |
| Gemini 2.0 Flash | Google | fast / cheap | 65.6 | $0.10 | $0.40 | 4.0× | 1M |
| Llama 3.3 70B Instruct | Meta | open-weights | 64.7 | $0.23 | $0.40 | 1.7× | 128k |
| Llama 3.1 70B Instruct | Meta | open-weights | 60.2 | $0.23 | $0.40 | 1.7× | 128k |
| Qwen2.5 72B Instruct | Alibaba | open-weights | 65.6 | $0.35 | $0.40 | 1.1× | 131,072 |
| GPT-4o mini | OpenAI | fast / cheap | 61.3 | $0.15 | $0.60 | 4.0× | 128k |
| Codestral 25.01 | Mistral AI | general-purpose | | $0.30 | $0.90 | 3.0× | 256k |
| DeepSeek V3 | DeepSeek | open-weights | 68.0 | $0.27 | $1.10 | 4.1× | 128k |

Mid tier — $5 to $30 blended

General-purpose models that balance quality and cost — the default tier for most production deployments.

| Model | Provider | Category | Score | $ in / 1M | $ out / 1M | Out/in | Context |
|---|---|---|---|---|---|---|---|
| GPT-5 mini | OpenAI | fast / cheap | 77.0 | $0.25 | $2.00 | 8.0× | 400k |
| DeepSeek R1 | DeepSeek | open-weights | 75.4 | $0.55 | $2.19 | 4.0× | 128k |
| Gemini 2.5 Flash | Google | fast / cheap | 73.3 | $0.30 | $2.50 | 8.3× | 1M |
| Llama 3.1 405B Instruct | Meta | open-weights | 65.7 | $2.70 | $2.70 | 1.0× | 128k |
| Claude 3.5 Haiku | Anthropic | fast / cheap | 56.2 | $0.80 | $4.00 | 5.0× | 200k |
| o3-mini | OpenAI | fast / cheap | 72.7 | $1.10 | $4.40 | 4.0× | 200k |
| Gemini 1.5 Pro | Google | general-purpose | 67.9 | $1.25 | $5.00 | 4.0× | 2M |
| Mistral Large 2 | Mistral AI | general-purpose | 63.7 | $2.00 | $6.00 | 3.0× | 128k |

Frontier tier — $30 to $100 blended

Top-of-leaderboard closed models. Pricing reflects R&D + premium inference; reserve for hard work.

| Model | Provider | Category | Score | $ in / 1M | $ out / 1M | Out/in | Context |
|---|---|---|---|---|---|---|---|
| o3 | OpenAI | frontier | 83.7 | $2.00 | $8.00 | 4.0× | 200k |
| GPT-4.1 | OpenAI | general-purpose | 74.5 | $2.00 | $8.00 | 4.0× | 1M |
| GPT-5 | OpenAI | frontier | 86.0 | $1.25 | $10.00 | 8.0× | 400k |
| Gemini 2.5 Pro | Google | frontier | 80.9 | $1.25 | $10.00 | 8.0× | 2M |
| GPT-4o | OpenAI | general-purpose | 66.8 | $2.50 | $10.00 | 4.0× | 128k |
| Command R+ | Cohere | general-purpose | 47.0 | $2.50 | $10.00 | 4.0× | 128k |
| Grok 4 | xAI | frontier | 83.6 | $3.00 | $15.00 | 5.0× | 256k |
| Grok 3 | xAI | general-purpose | 81.7 | $3.00 | $15.00 | 5.0× | 1M |
| Claude Sonnet 4 | Anthropic | general-purpose | 80.7 | $3.00 | $15.00 | 5.0× | 200k |
| Claude 3.7 Sonnet | Anthropic | general-purpose | 76.0 | $3.00 | $15.00 | 5.0× | 200k |
| Claude 3.5 Sonnet | Anthropic | general-purpose | 69.1 | $3.00 | $15.00 | 5.0× | 200k |

Premium tier — over $100 blended

The most expensive models on the market. Use only when frontier reasoning is unavoidable.

| Model | Provider | Category | Score | $ in / 1M | $ out / 1M | Out/in | Context |
|---|---|---|---|---|---|---|---|
| o1 | OpenAI | frontier | 75.7 | $15.00 | $60.00 | 4.0× | 200k |
| Claude Opus 4.1 | Anthropic | frontier | 83.6 | $15.00 | $75.00 | 5.0× | 200k |

How we collect prices

Every price on this page comes from the provider's published per-1M-token rate, last verified 2026-05-10. We do not include:

If you spot a stale number, open an issue — we update weekly.

Frequently asked questions

What is the cheapest LLM API right now?

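As of 2026-05-10, the cheapest APIs on this page by blended cost are Phi-4 ($0.63), Qwen2.5-Coder 32B ($0.90), and Gemini 2.0 Flash ($1.70). GPT-4o mini and DeepSeek V3 also sit in the cheap tier, and all five are well under $1 per million input tokens. Claude 3.5 Haiku is under $1 per million input tokens too ($0.80), but its $4.00 output price puts it in the mid tier on blended cost. The exact ranking depends on your input/output mix; the tables on this page are sorted by blended (input + 4× output) cost, so the order approximates what you would see on a typical bill.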

Why does output cost more than input?

Generating tokens is much more compute-intensive than reading them. For most frontier models, output is 3–5× the input price, though some on this page (GPT-5 and Gemini 2.5 Pro) reach 8×. That means prompt-heavy workloads (RAG, classification, extraction) are far cheaper per call than generation-heavy workloads (long-form writing, code generation) at the same total token count.
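To make the gap concrete, here is a minimal sketch that prices two requests with the same total token count but opposite input/output mixes. It uses the GPT-4o prices from the table above; the token counts and the function name are hypothetical, chosen only for illustration.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_per_m: float, output_per_m: float) -> float:
    """Dollar cost of one request, given per-1M-token prices."""
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000


# GPT-4o prices per 1M tokens, from the frontier-tier table on this page.
GPT4O_IN, GPT4O_OUT = 2.50, 10.00

# Hypothetical workloads, both 4,500 tokens in total.
prompt_heavy = request_cost(4000, 500, GPT4O_IN, GPT4O_OUT)      # e.g. RAG
generation_heavy = request_cost(500, 4000, GPT4O_IN, GPT4O_OUT)  # e.g. long-form writing

print(f"prompt-heavy:     ${prompt_heavy:.5f}")
print(f"generation-heavy: ${generation_heavy:.5f}")
print(f"ratio:            {generation_heavy / prompt_heavy:.2f}x")
```

Under these assumed token counts, the prompt-heavy call costs $0.015 and the generation-heavy call $0.04125, a 2.75× difference for the same number of tokens.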

How do these prices compare to ChatGPT Plus or Claude Pro?

These are pay-as-you-go API prices, not consumer chat subscriptions. ChatGPT Plus / Pro and Claude Pro are flat-rate plans aimed at human chat use; the API prices above are what you pay per token when calling the model from your own application. For programmatic use, the API is almost always cheaper unless your usage is very low.

How can I cut LLM costs without changing models?

Three highest-impact moves: (1) cache static prompt prefixes — most providers now bill cached input tokens at 10–50% of the regular rate, (2) trim system prompts aggressively, (3) route easy queries to a cheaper model and only escalate to a frontier model when needed. OpenRouter exposes per-request model selection on a single API key, which makes A/B testing this trivial.
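The effect of prefix caching on the input side of a bill can be estimated with a short sketch. It assumes the cached prefix is billed at a flat fraction of the regular input rate (the 10–50% range mentioned above) and ignores cache-write surcharges and cache expiry, which vary by provider. The function name and token counts are hypothetical; the $3.00 input price is the Claude Sonnet 4 row from the table.

```python
def input_cost_with_cache(prefix_tokens: int, dynamic_tokens: int,
                          input_per_m: float, cached_fraction: float) -> float:
    """Input-side dollar cost of one request with a cached static prefix.

    cached_fraction is the share of the regular input rate charged for
    cached tokens (0.10 to 0.50 per the range quoted on this page;
    1.0 means no caching discount). Assumption: a flat discount with no
    cache-write fee, which real providers may not match.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    billable_tokens = prefix_tokens * cached_fraction + dynamic_tokens
    return billable_tokens * input_per_m / 1_000_000


INPUT_PER_M = 3.00           # $ per 1M input tokens (Claude Sonnet 4 row)
PREFIX, DYNAMIC = 3000, 500  # hypothetical system prompt + per-request text

for fraction in (1.0, 0.5, 0.1):
    cost = input_cost_with_cache(PREFIX, DYNAMIC, INPUT_PER_M, fraction)
    print(f"cached at {fraction:.0%} of input rate: ${cost:.4f} per request")
```

With these assumed numbers, the input cost per request falls from $0.0105 uncached to $0.0060 at a 50% cached rate and $0.0024 at a 10% cached rate. Output tokens are unaffected, so the saving matters most for prompt-heavy workloads.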

Is this list complete?

The table covers every major commercial LLM with a public API as of 2026-05-10 — 30 models from OpenAI, Anthropic, Google, Meta, DeepSeek, xAI, Mistral, Microsoft, Cohere, and Alibaba. Smaller providers and self-hosted setups (where you bear hardware cost instead of per-token cost) are out of scope.


See also: API cost calculator · Head-to-head comparisons · Best cheap LLM API guide · Best free LLM API guide
