LLM Rank.top

Leaderboard · Compare · GPT-5 vs Claude Sonnet 4 · Updated

GPT-5 vs Claude Sonnet 4

GPT-5 edges out Claude Sonnet 4 on the composite (86.0 vs 80.7). The gap is meaningful but not decisive — see the per-benchmark breakdown below.

GPT-5 · composite 86.0 Claude Sonnet 4 · composite 80.7 frontier vs general-purpose
Try GPT-5 → Try Claude Sonnet 4 → A/B test both via OpenRouter →

At a glance

SpecGPT-5Claude Sonnet 4
ProviderOpenAIAnthropic
Released2025-082025-05
Tierfrontiergeneral-purpose
LicenseClosedClosed
Context window400k200k
$ in / out (per 1M)$1.25 / $10.00$3.00 / $15.00

Benchmark scoreboard

Higher is better on every benchmark. Δ shows GPT-5 − Claude Sonnet 4.

BenchmarkGPT-5Claude Sonnet 4Δ
Chatbot Arena Elo 1410 1370 +40
MMLU-Pro 86.8 84.0 +2.8
GPQA Diamond 87.3 75.4 +11.9
MATH 96.7 93.0 +3.7
HumanEval 95.1 93.7 +1.4
SWE-Bench Verified 74.9 72.7 +2.2

Numbers compiled from provider technical reports and Chatbot Arena snapshots — see methodology.

Don't pick blind — A/B test both models on the same API key.

OpenRouter routes GPT-5, Claude Sonnet 4, and 100+ other LLMs behind a single API key — pay-as-you-go, no monthly minimum, fallback if a provider is down. Try OpenRouter → (affiliate · supports this site)

GPT-5 vs Claude Sonnet 4: where each one wins

GPT-5 is stronger on

  • Arena
  • MMLU-Pro
  • GPQA
  • MATH
  • HumanEval
  • SWE-Bench

Claude Sonnet 4 is stronger on

No benchmarks where Claude Sonnet 4 beats GPT-5 with comparable data.

Cost comparison

At 10M tokens/day (50/50 split), GPT-5 costs ~$56.25/day vs $90.00/day for Claude Sonnet 4 — GPT-5 is the cheaper pick at this volume.

Verdict

GPT-5 edges out Claude Sonnet 4 on the composite (86.0 vs 80.7). The gap is meaningful but not decisive — see the per-benchmark breakdown below.

If you can only pick one and your workload is unclear, route via OpenRouter and switch by request — same key, no lock-in.

Frequently asked questions

Which is better, GPT-5 or Claude Sonnet 4?

GPT-5 edges out Claude Sonnet 4 on the composite (86.0 vs 80.7). The gap is meaningful but not decisive — see the per-benchmark breakdown below. GPT-5 wins on Arena, MMLU-Pro, GPQA, MATH, HumanEval, SWE-Bench; Claude Sonnet 4 wins on no benchmarks.

What does GPT-5 cost compared to Claude Sonnet 4?

At 10M tokens/day (50/50 split), GPT-5 costs ~$56.25/day vs $90.00/day for Claude Sonnet 4 — GPT-5 is the cheaper pick at this volume.

What is the context window of GPT-5 vs Claude Sonnet 4?

GPT-5: 400k tokens. Claude Sonnet 4: 200k tokens. GPT-5 has the larger window — useful for long-document RAG and full-codebase prompting.

Is GPT-5 or Claude Sonnet 4 open source?

GPT-5: closed / proprietary. Claude Sonnet 4: closed / proprietary.

Can I try GPT-5 and Claude Sonnet 4 on the same API key?

Yes — OpenRouter routes both models behind a single key, so you can A/B test GPT-5 against Claude Sonnet 4 without juggling provider accounts.


Model deep-dives: GPT-5 · Claude Sonnet 4 · Full leaderboard

Spotted out-of-date numbers? Open an issue — corrections usually ship within 24h.

Try GPT-5 and Claude Sonnet 4 now

One API key, both models — switch between them per request and let real traffic pick the winner.

Try GPT-5 → Try Claude Sonnet 4 → A/B test both via OpenRouter →