LLM Rank.top

Leaderboard · Compare · 30 matchups · Updated

LLM head-to-head comparisons

Every popular "X vs Y" matchup on one page. Each link goes to a side-by-side page with composite score, raw benchmark numbers, API pricing, cost-at-scale, and a verdict by use case.

A/B test any pair via OpenRouter → Custom 2-model comparison →
Don't pick blind — A/B test any two models on the same API key.

OpenRouter routes GPT-5, Claude, Gemini, DeepSeek, Llama, Grok, Qwen, Mistral and 100+ others behind a single key — pay-as-you-go, no monthly minimum. Try OpenRouter → (affiliate · supports this site)

Frontier head-to-heads

The closed flagships fighting for #1 on the leaderboard.

Closed vs open

Top closed-source model against the best open-weights challenger.

Cheap / fast tier

Production-volume mini models compared on price-per-quality.

Open-weights duels

Self-hostable models head-to-head — DeepSeek, Llama, Qwen, Mistral, Phi.

Cross-tier & cross-vendor

Other matchups — different tiers or vendors.

Looking for a specific pair we don't list? Use the custom 2-model comparison tool — every model on the leaderboard can be picked from the dropdown.