Leaderboard · Compare · Qwen2.5-Coder 32B vs Codestral 25.01 · Updated
Qwen2.5-Coder 32B vs Codestral 25.01
Qwen2.5-Coder 32B has the better-documented benchmark profile (composite 68.8). Codestral 25.01 is harder to compare quantitatively.
At a glance
| Spec | Qwen2.5-Coder 32B | Codestral 25.01 |
|---|---|---|
| Provider | Alibaba | Mistral AI |
| Released | 2024-11 | 2025-01 |
| Tier | open-weights | general-purpose |
| License | Open · Apache-2.0 | Closed |
| Context window | 131.072k | 256k |
| $ in / out (per 1M) | $0.18 / $0.18 | $0.30 / $0.90 |
Benchmark scoreboard
Higher is better on every benchmark. Δ shows Qwen2.5-Coder 32B − Codestral 25.01.
| Benchmark | Qwen2.5-Coder 32B | Codestral 25.01 | Δ |
|---|---|---|---|
| Chatbot Arena Elo | N/A | N/A | — |
| MMLU-Pro | 68.4 | N/A | — |
| GPQA Diamond | 40.0 | N/A | — |
| MATH | 83.1 | N/A | — |
| HumanEval | 92.7 | 86.6 | +6.1 |
| SWE-Bench Verified | N/A | N/A | — |
Numbers compiled from provider technical reports and Chatbot Arena snapshots — see methodology.
OpenRouter routes Qwen2.5-Coder 32B, Codestral 25.01, and 100+ other LLMs behind a single API key — pay-as-you-go, no monthly minimum, fallback if a provider is down. Try OpenRouter → (affiliate · supports this site)
Qwen2.5-Coder 32B vs Codestral 25.01: where each one wins
Qwen2.5-Coder 32B is stronger on
- HumanEval
Codestral 25.01 is stronger on
No benchmarks where Codestral 25.01 beats Qwen2.5-Coder 32B with comparable data.
Cost comparison
At 10M tokens/day (50/50 split), Qwen2.5-Coder 32B costs ~$1.80/day vs $6.00/day for Codestral 25.01 — Qwen2.5-Coder 32B is the cheaper pick at this volume.
Verdict
Qwen2.5-Coder 32B has the better-documented benchmark profile (composite 68.8). Codestral 25.01 is harder to compare quantitatively.
If you can only pick one and your workload is unclear, route via OpenRouter and switch by request — same key, no lock-in.
Frequently asked questions
Which is better, Qwen2.5-Coder 32B or Codestral 25.01?
Qwen2.5-Coder 32B has the better-documented benchmark profile (composite 68.8). Codestral 25.01 is harder to compare quantitatively. Qwen2.5-Coder 32B wins on HumanEval; Codestral 25.01 wins on no benchmarks.
What does Qwen2.5-Coder 32B cost compared to Codestral 25.01?
At 10M tokens/day (50/50 split), Qwen2.5-Coder 32B costs ~$1.80/day vs $6.00/day for Codestral 25.01 — Qwen2.5-Coder 32B is the cheaper pick at this volume.
What is the context window of Qwen2.5-Coder 32B vs Codestral 25.01?
Qwen2.5-Coder 32B: 131.072k tokens. Codestral 25.01: 256k tokens. Codestral 25.01 has the larger window — useful for long-document RAG and full-codebase prompting.
Is Qwen2.5-Coder 32B or Codestral 25.01 open source?
Qwen2.5-Coder 32B: open weights (Apache-2.0). Codestral 25.01: closed / proprietary.
Can I try Qwen2.5-Coder 32B and Codestral 25.01 on the same API key?
Yes — OpenRouter routes both models behind a single key, so you can A/B test Qwen2.5-Coder 32B against Codestral 25.01 without juggling provider accounts.
Model deep-dives: Qwen2.5-Coder 32B · Codestral 25.01 · Full leaderboard
Spotted out-of-date numbers? Open an issue — corrections usually ship within 24h.
Try Qwen2.5-Coder 32B and Codestral 25.01 now
One API key, both models — switch between them per request and let real traffic pick the winner.