Leaderboard · Compare · Mistral Large 2 vs Claude Sonnet 4 · Updated
Mistral Large 2 vs Claude Sonnet 4
Claude Sonnet 4 edges out Mistral Large 2 on the composite (80.7 vs 63.7). The gap is meaningful but not decisive — see the per-benchmark breakdown below.
At a glance
| Spec | Mistral Large 2 | Claude Sonnet 4 |
|---|---|---|
| Provider | Mistral AI | Anthropic |
| Released | 2024-07 | 2025-05 |
| Tier | general-purpose | general-purpose |
| License | Closed | Closed |
| Context window | 128k | 200k |
| $ in / out (per 1M) | $2.00 / $6.00 | $3.00 / $15.00 |
Benchmark scoreboard
Higher is better on every benchmark. Δ shows Mistral Large 2 − Claude Sonnet 4.
| Benchmark | Mistral Large 2 | Claude Sonnet 4 | Δ |
|---|---|---|---|
| Chatbot Arena Elo | 1251 | 1370 | -119 |
| MMLU-Pro | 69.4 | 84.0 | -14.6 |
| GPQA Diamond | 48.9 | 75.4 | -26.5 |
| MATH | 71.5 | 93.0 | -21.5 |
| HumanEval | 92.0 | 93.7 | -1.7 |
| SWE-Bench Verified | N/A | 72.7 | — |
Numbers compiled from provider technical reports and Chatbot Arena snapshots — see methodology.
OpenRouter routes Mistral Large 2, Claude Sonnet 4, and 100+ other LLMs behind a single API key — pay-as-you-go, no monthly minimum, fallback if a provider is down. Try OpenRouter → (affiliate · supports this site)
Mistral Large 2 vs Claude Sonnet 4: where each one wins
Mistral Large 2 is stronger on
No benchmarks where Mistral Large 2 beats Claude Sonnet 4 with comparable data.
Claude Sonnet 4 is stronger on
- Arena
- MMLU-Pro
- GPQA
- MATH
- HumanEval
Cost comparison
At 10M tokens/day (50/50 split), Mistral Large 2 costs ~$40.00/day vs $90.00/day for Claude Sonnet 4 — Mistral Large 2 is the cheaper pick at this volume.
Verdict
Claude Sonnet 4 edges out Mistral Large 2 on the composite (80.7 vs 63.7). The gap is meaningful but not decisive — see the per-benchmark breakdown below.
If you can only pick one and your workload is unclear, route via OpenRouter and switch by request — same key, no lock-in.
Frequently asked questions
Which is better, Mistral Large 2 or Claude Sonnet 4?
Claude Sonnet 4 edges out Mistral Large 2 on the composite (80.7 vs 63.7). The gap is meaningful but not decisive — see the per-benchmark breakdown below. Mistral Large 2 wins on no benchmarks; Claude Sonnet 4 wins on Arena, MMLU-Pro, GPQA, MATH, HumanEval.
What does Mistral Large 2 cost compared to Claude Sonnet 4?
At 10M tokens/day (50/50 split), Mistral Large 2 costs ~$40.00/day vs $90.00/day for Claude Sonnet 4 — Mistral Large 2 is the cheaper pick at this volume.
What is the context window of Mistral Large 2 vs Claude Sonnet 4?
Mistral Large 2: 128k tokens. Claude Sonnet 4: 200k tokens. Claude Sonnet 4 has the larger window — useful for long-document RAG and full-codebase prompting.
Is Mistral Large 2 or Claude Sonnet 4 open source?
Mistral Large 2: closed / proprietary. Claude Sonnet 4: closed / proprietary.
Can I try Mistral Large 2 and Claude Sonnet 4 on the same API key?
Yes — OpenRouter routes both models behind a single key, so you can A/B test Mistral Large 2 against Claude Sonnet 4 without juggling provider accounts.
Model deep-dives: Mistral Large 2 · Claude Sonnet 4 · Full leaderboard
Spotted out-of-date numbers? Open an issue — corrections usually ship within 24h.
Try Mistral Large 2 and Claude Sonnet 4 now
One API key, both models — switch between them per request and let real traffic pick the winner.