LLM Rank.top

About LLM Rank

An independent, vendor-neutral leaderboard for large language models — built so engineers and researchers can pick the right model in under a minute.

Why this site exists

The major LLM vendors each publish their own leaderboards, and each conveniently has their own model on top. Independent academic boards (Chatbot Arena, Open LLM Leaderboard, LiveBench) are excellent, but each measures only one slice. llmrank.top combines those public signals into a single composite score, alongside the data engineers actually need before adopting a model: price per million tokens, context window, licence, and a direct link to try it.

Composite-score methodology

For every model we look up the published score on each of six public benchmarks.

The composite is a weighted average over the benchmarks the model actually has data for. Missing benchmarks are dropped from that model's denominator rather than counted as zero — so a brand-new release with sparse public data is not unfairly penalised. Models with fewer than three published benchmarks are listed without a composite score, to prevent a single-benchmark specialist (e.g. a coding-only model with HumanEval but no Arena, MMLU, GPQA, etc.) from outranking well-rounded frontier models. Arena Elo is rescaled from a 1000–1500 band onto 0–100; percent benchmarks pass through unchanged. The full normalisation logic is in assets/js/site.js.
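The logic above can be sketched in a few lines. This is an illustrative reimplementation, not the actual code in assets/js/site.js; the function names, the minimum-benchmark threshold of three, and the equal weights in the example are taken from or assumed for this description.

```javascript
// Rescale an Arena Elo from the 1000-1500 band onto 0-100,
// clamping values that fall outside the band.
function rescaleElo(elo) {
  return Math.min(100, Math.max(0, ((elo - 1000) / 500) * 100));
}

// Weighted average over the benchmarks the model actually has data for.
// `scores` maps benchmark name -> score on a 0-100 scale (Elo already
// rescaled); `weights` maps benchmark name -> weight.
function composite(scores, weights) {
  const available = Object.keys(scores).filter((b) => b in weights);
  // Fewer than three published benchmarks: no composite, so a
  // single-benchmark specialist cannot outrank well-rounded models.
  if (available.length < 3) return null;
  let num = 0;
  let den = 0;
  for (const b of available) {
    num += scores[b] * weights[b];
    den += weights[b]; // missing benchmarks never enter the denominator
  }
  return num / den;
}
```

With equal weights, a model scoring 80, 90, and 70 on three benchmarks gets a composite of 80, while a model with only two published scores gets no composite at all.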

Where the numbers come from

Scores are compiled in this priority order:

  1. Provider technical report or model card (e.g. OpenAI system cards, Anthropic model cards, the DeepSeek-V3/R1 papers, Meta's Llama 3 / 3.1 / 3.3 papers).
  2. Independent academic snapshots: Chatbot Arena, LiveBench, Aider Polyglot, BigCodeBench.
  3. Reproductions on Hugging Face's Open LLM Leaderboard when the provider has not published its own number.
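The priority order above amounts to a simple first-match lookup. The sketch below is illustrative only; the source keys and the function name are assumptions for this description, not the site's actual data model.

```javascript
// Highest priority first, mirroring the numbered list above.
const SOURCE_PRIORITY = [
  "provider_report",        // 1. provider technical report or model card
  "academic_snapshot",      // 2. Chatbot Arena, LiveBench, etc.
  "open_llm_leaderboard",   // 3. Hugging Face reproduction as fallback
];

// Given candidate scores keyed by source type, return the score from
// the highest-priority source that actually published a number.
function pickScore(candidates) {
  for (const source of SOURCE_PRIORITY) {
    if (candidates[source] != null) {
      return { source, value: candidates[source] };
    }
  }
  return null; // no published number from any source
}
```

So a provider-published figure always wins; an Open LLM Leaderboard reproduction is used only when nothing higher-priority exists.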

Pricing is the public list price published on the model's official API platform, in USD per 1M tokens, at the time of the last update. Context windows are the maximum supported in the official API, not a beta or research preview number.
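A single record in data/models.json would then carry the fields described above. The shape below is a hypothetical illustration; every field name is an assumption, not the file's real schema.

```javascript
// Hypothetical shape of one data/models.json entry; all field names
// here are assumed for illustration only.
const exampleEntry = {
  name: "example-model-v1",       // placeholder model name
  input_price_usd_per_1m: 3.0,    // public list price, USD per 1M input tokens
  output_price_usd_per_1m: 15.0,  // USD per 1M output tokens
  context_window: 200000,         // maximum supported in the official API
  licence: "proprietary",         // or an open licence identifier
  try_url: "https://example.com", // direct link to try the model
};
```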

Update cadence

The dataset is intended to be refreshed within a week of any major release. The "Last updated" date in the header reflects the most recent change to data/models.json. Updates for significant events (new flagship models, price cuts of more than 30%) are typically pushed within 24 hours.

Editorial policy

Sponsorship & advertising

Reach engineers actively choosing an LLM

llmrank.top visitors are mid-funnel buyers: they have a model selection task open in another tab. We offer:

  • Sponsored slot at the top of the leaderboard (clearly labelled "Sponsored", not affecting ranks).
  • Logo placement on benchmark / pricing pages.
  • Newsletter sponsorship in the weekly LLM digest.
  • Custom benchmark posts ("How <your model> compares on coding tasks"), with the sponsorship disclosed and full editorial control retained.

Email hi@llmrank.top for the rate card.

Contact & corrections

Spotted a wrong number, missing model, or outdated price? Please open an issue on GitHub — that is the fastest path. For everything else: hi@llmrank.top.

About the maintainer

llmrank.top is maintained by @yuexiaoliang. The site is open source under the MIT licence; data is published under CC-BY-4.0 — feel free to fork, embed, or build on top of it.