Leaderboard · All models · 30 entries
Every LLM we track
Each row lists a model's composite benchmark score, API pricing (input / output, in dollars per million tokens), and context window.
| # | Model | Provider | Class | Composite | $ in / out (per 1M tokens) | Context |
|---|---|---|---|---|---|---|
| 1 | GPT-5 | OpenAI | frontier | 86.0 | $1.25 / $10.00 | 400k |
| 2 | o3 | OpenAI | frontier | 83.7 | $2.00 / $8.00 | 200k |
| 3 | Grok 4 | xAI | frontier | 83.6 | $3.00 / $15.00 | 256k |
| 4 | Claude Opus 4.1 | Anthropic | frontier | 83.6 | $15.00 / $75.00 | 200k |
| 5 | Grok 3 | xAI | general-purpose | 81.7 | $3.00 / $15.00 | 1M |
| 6 | Gemini 2.5 Pro | Google | frontier | 80.9 | $1.25 / $10.00 | 2M |
| 7 | Claude Sonnet 4 | Anthropic | general-purpose | 80.7 | $3.00 / $15.00 | 200k |
| 8 | GPT-5 mini | OpenAI | fast / cheap | 77.0 | $0.25 / $2.00 | 400k |
| 9 | Claude 3.7 Sonnet | Anthropic | general-purpose | 76.0 | $3.00 / $15.00 | 200k |
| 10 | o1 | OpenAI | frontier | 75.7 | $15.00 / $60.00 | 200k |
| 11 | DeepSeek R1 | DeepSeek | open-weights | 75.4 | $0.55 / $2.19 | 128k |
| 12 | GPT-4.1 | OpenAI | general-purpose | 74.5 | $2.00 / $8.00 | 1M |
| 13 | Gemini 2.5 Flash | Google | fast / cheap | 73.3 | $0.30 / $2.50 | 1M |
| 14 | o3-mini | OpenAI | fast / cheap | 72.7 | $1.10 / $4.40 | 200k |
| 15 | Phi-4 | Microsoft | open-weights | 71.2 | $0.07 / $0.14 | 16k |
| 16 | Claude 3.5 Sonnet | Anthropic | general-purpose | 69.1 | $3.00 / $15.00 | 200k |
| 17 | Qwen2.5-Coder 32B | Alibaba | open-weights | 68.8 | $0.18 / $0.18 | 128k |
| 18 | DeepSeek V3 | DeepSeek | open-weights | 68.0 | $0.27 / $1.10 | 128k |
| 19 | Gemini 1.5 Pro | Google | general-purpose | 67.9 | $1.25 / $5.00 | 2M |
| 20 | GPT-4o | OpenAI | general-purpose | 66.8 | $2.50 / $10.00 | 128k |
| 21 | Llama 3.1 405B Instruct | Meta | open-weights | 65.7 | $2.70 / $2.70 | 128k |
| 22 | Gemini 2.0 Flash | Google | fast / cheap | 65.6 | $0.10 / $0.40 | 1M |
| 23 | Qwen2.5 72B Instruct | Alibaba | open-weights | 65.6 | $0.35 / $0.40 | 128k |
| 24 | Llama 3.3 70B Instruct | Meta | open-weights | 64.7 | $0.23 / $0.40 | 128k |
| 25 | Mistral Large 2 | Mistral AI | general-purpose | 63.7 | $2.00 / $6.00 | 128k |
| 26 | GPT-4o mini | OpenAI | fast / cheap | 61.3 | $0.15 / $0.60 | 128k |
| 27 | Llama 3.1 70B Instruct | Meta | open-weights | 60.2 | $0.23 / $0.40 | 128k |
| 28 | Claude 3.5 Haiku | Anthropic | fast / cheap | 56.2 | $0.80 / $4.00 | 200k |
| 29 | Command R+ | Cohere | general-purpose | 47.0 | $2.50 / $10.00 | 128k |
| 30 | Codestral 25.01 | Mistral AI | general-purpose | — | $0.30 / $0.90 | 256k |
Scores are compiled from public technical reports and Chatbot Arena snapshots. See methodology.
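To turn the pricing column into a per-request dollar figure, multiply token counts by the listed rates, assuming (per the usual API convention) that prices are per million tokens. A minimal sketch; `estimate_cost` is a hypothetical helper, not part of any provider's SDK:

```python
# Rough per-request cost from the "$ in / out" column.
# Assumption: listed prices are dollars per 1M tokens.
def estimate_cost(input_tokens: int, output_tokens: int,
                  price_in: float, price_out: float) -> float:
    """Return the dollar cost of one request at per-1M-token prices."""
    return (input_tokens / 1_000_000) * price_in \
         + (output_tokens / 1_000_000) * price_out

# Example with GPT-5's listed rates ($1.25 in / $10.00 out):
# a 20k-token prompt producing a 2k-token reply.
cost = estimate_cost(20_000, 2_000, 1.25, 10.00)
print(f"${cost:.4f}")
```

At these rates the example request costs about four and a half cents, which is why the output price often dominates for long completions.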