How is this different from a Hugging Face leaderboard?

Leaderboards rank shipped models by benchmark scores. This tracker also covers what is *about to ship*: confirmed announcements, strong-signal leaks, and tagged speculation. It is forward-looking, dated, and sourced. Leaderboards are excellent for benchmarks; the tracker is for "should I wait two weeks before picking a model?"

Why do labels sometimes move down toward Speculation?

Because evidence weakens. A Strong-signal claim based on multiple reports gets demoted to Speculation if those reports turn out to share one upstream source. Demotions get a date and an edit-log entry; we do not silently rewrite history.
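The demotion rule can be sketched in a few lines. This is a minimal illustration and not the tracker's actual tooling: the `Report` type, its `upstream` field, and the threshold of two independent origins are all assumptions made for the example, and it ignores the separate path where a single report is backed by an artifact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    outlet: str    # who published the report
    upstream: str  # hypothetical field: where the outlet got the claim


def independent_sources(reports: list[Report]) -> int:
    """Count distinct upstream origins, not distinct outlets."""
    return len({r.upstream for r in reports})


def recheck_label(reports: list[Report]) -> str:
    # Assumption: "multiple independent reports" means at least two origins.
    if independent_sources(reports) >= 2:
        return "Strong signal"
    return "Speculation"


# Three outlets repeating the same origin count as one source.
reports = [
    Report(outlet="Outlet A", upstream="anonymous forum post"),
    Report(outlet="Outlet B", upstream="anonymous forum post"),
    Report(outlet="Outlet C", upstream="anonymous forum post"),
]
print(recheck_label(reports))  # Speculation
```

Counting origins instead of outlets is the whole point: three articles that trace back to one post are one piece of evidence.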

What if a model ships between tracker updates?

It moves from this article into the Bench cluster as a dated head-to-head. The tracker keeps a one-line "shipped: see [bench article]" link and moves on to the next set of rumors.

Open-Weight Model Landscape — How to Read It, May 2026

As of 2026-05-16

If you came here looking for a list of "the best open-weight models right now," you are in the wrong neighborhood. That list exists, in real time, on the Hugging Face pages of the labs that publish the models — and any specific version we named here would be wrong within weeks. What this page does instead is explain how we read the landscape, so you can read it yourself.

How to read open-weight news

Three labels do the bulk of the work, applied consistently across every model-release claim we publish:

Confirmed. A lab has published a model card with downloadable weights. Nothing else qualifies, no matter how loudly the model was previewed.
Strong signal. Multiple independent reputable reports, or one report backed by an artifact: a config file, a server endpoint, a model-card stub without weights, an official "we will release weights for X" statement.
Speculation. Cadence-based or inferential. Useful for context. Not a prediction.

The same labels apply to everything in this cluster. The full methodology lives at how-to-read-llm-release-rumors.
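The three labels amount to a small decision rule, which can be written down as a sketch. Everything named here (`Claim`, its fields, `classify`) is hypothetical and chosen for illustration. In particular, reading "multiple" as "two or more" is an assumption, and judging whether a report is reputable or independent is left to whoever fills in the fields.

```python
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    CONFIRMED = "Confirmed"
    STRONG_SIGNAL = "Strong signal"
    SPECULATION = "Speculation"


@dataclass
class Claim:
    # All field names are hypothetical, chosen for this example.
    model_card_published: bool = False
    weights_downloadable: bool = False
    independent_reports: int = 0  # reputable reports with distinct origins
    has_artifact: bool = False    # config file, endpoint, card stub, official statement


def classify(claim: Claim) -> Label:
    # Confirmed: a model card with downloadable weights, nothing less.
    if claim.model_card_published and claim.weights_downloadable:
        return Label.CONFIRMED
    # Strong signal: multiple independent reports...
    if claim.independent_reports >= 2:
        return Label.STRONG_SIGNAL
    # ...or one report backed by an artifact.
    if claim.independent_reports >= 1 and claim.has_artifact:
        return Label.STRONG_SIGNAL
    # Everything else is cadence-based or inferential.
    return Label.SPECULATION


# A card stub without weights, plus one report: Strong signal, not Confirmed.
stub = Claim(model_card_published=True, independent_reports=1, has_artifact=True)
print(classify(stub).value)  # Strong signal
```

The order of the checks matters: Confirmed is tested first so that no amount of reporting can substitute for downloadable weights.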

Where the real state lives

We deliberately do not maintain a "current top open-weight models" list here because that list goes stale faster than this article can be republished. The canonical primary source for each major lab is its own Hugging Face organization page or news index:

Meta Llama — huggingface.co/meta-llama
Alibaba Qwen — huggingface.co/Qwen and github.com/QwenLM
Mistral AI — huggingface.co/mistralai and mistral.ai/news
DeepSeek — huggingface.co/deepseek-ai and github.com/deepseek-ai
Google DeepMind (Gemma) — deepmind.google/technologies/gemma and huggingface.co/google
Zhipu / GLM — huggingface.co/THUDM
Microsoft Phi — huggingface.co/microsoft

Open one tab per lab and you have a more accurate "what shipped this month" view than any tracker article.
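For readers who would rather script the "one tab per lab" routine, here is a minimal sketch using only the Python standard library. The addresses are copied from the list above. The `https://` prefix, the dictionary layout, and the function names are assumptions added for the example.

```python
import webbrowser

# Lab pages copied from the list above, without a URL scheme.
LAB_PAGES: dict[str, list[str]] = {
    "Meta Llama": ["huggingface.co/meta-llama"],
    "Alibaba Qwen": ["huggingface.co/Qwen", "github.com/QwenLM"],
    "Mistral AI": ["huggingface.co/mistralai", "mistral.ai/news"],
    "DeepSeek": ["huggingface.co/deepseek-ai", "github.com/deepseek-ai"],
    "Google DeepMind (Gemma)": [
        "deepmind.google/technologies/gemma",
        "huggingface.co/google",
    ],
    "Zhipu / GLM": ["huggingface.co/THUDM"],
    "Microsoft Phi": ["huggingface.co/microsoft"],
}


def lab_urls(pages: dict[str, list[str]] = LAB_PAGES) -> list[str]:
    """Flatten the lab pages into full URLs (assumes https for every page)."""
    return [f"https://{page}" for lab in pages.values() for page in lab]


def open_all(urls: list[str]) -> None:
    """Open each URL in a new tab of the default browser."""
    for url in urls:
        webbrowser.open_new_tab(url)
```

Calling `open_all(lab_urls())` opens every page at once; passing a smaller dictionary to `lab_urls` limits it to the labs you follow.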

One verified example, for shape

To make the methodology concrete: the Gemma 4 family (Google DeepMind) shipped as open weights on 2026-04-03, in 2B, 9B, and 27B sizes with base and instruction-tuned variants. The primary source is the Gemma overview page above, plus the corresponding Hugging Face model cards. That is what a Confirmed claim looks like in this cluster: a date you can verify, a model card you can open, no third-party-tracker-only sourcing.

If you read a sentence about a specific open-weight model on this site that does not meet that bar, it is either wrong or should not have been published. We try hard to make sure neither happens.

What is announced but not yet shipped

A model belongs here when the lab itself has publicly committed to a future open-weight release but the weights are not yet downloadable: a forthcoming-release blog post, a keynote with a date, a Hugging Face placeholder card. Once weights land, the claim moves to Confirmed and the dated comparison work moves to the Bench cluster.

We do not enumerate the current "announced but not shipped" set here. It changes too fast. Check each lab's own announcements.

What is genuinely speculative

Cadence claims ("Llama tends to ship every N months"), capability extrapolations ("the next release will probably add X"), and pricing-tier predictions are all Speculation in this framework. They have explanatory value when paired with the underlying release log or paper. They have no value as predictions, and we do not use them that way.

The honest read of 2026: the major labs ship faster than any monthly article can keep up with. Most "open-weight release" predictions older than a few weeks are stale by the time you read them. Pair this page's framework with the lab links above, and you have everything you need to read the landscape better than the headlines do.

What this cluster is not

Not a leaderboard. Not a buying guide. Not a venue for unsourced screenshots. The Bench cluster handles dated head-to-heads with disclosed methodology. The Local Models cluster handles which model to actually run on your hardware. This cluster handles "what is coming, with what confidence" — and where the confidence is low, the article says so plainly.
