As of 2026-05-16
If you came here looking for a list of "the best open-weight models right now," you are in the wrong neighborhood. That list exists, in real time, on the Hugging Face pages of the labs that publish the models — and any specific version we named here would be wrong within weeks. What this page does instead is explain how we read the landscape, so you can read it yourself.
How to read open-weight news
Three labels do the bulk of the work, applied consistently across every model-release claim we publish:
- Confirmed. A lab has published a model card with downloadable weights. Nothing else qualifies, no matter how loudly the model was previewed.
- Strong signal. Multiple independent reputable reports, or one report backed by an artifact: a config file, a server endpoint, a model-card stub without weights, an official "we will release weights for X" statement.
- Speculation. Cadence-based or inferential. Useful for context. Not a prediction.
The same labels apply to everything in this cluster. The full methodology lives at how-to-read-llm-release-rumors.
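As a rough illustration, the three labels can be read as a small decision rule. The sketch below is not the site's actual tooling: the `Evidence` fields, the `classify` function, and the reading of "multiple" as "two or more" are all assumptions made for this example. It also takes the Strong signal definition literally, so an artifact with no accompanying report falls through to Speculation.

```python
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    CONFIRMED = "Confirmed"
    STRONG_SIGNAL = "Strong signal"
    SPECULATION = "Speculation"


@dataclass
class Evidence:
    # All field names are hypothetical, chosen for this sketch.
    model_card_with_weights: bool = False  # lab-published card, weights downloadable
    independent_reports: int = 0           # independent reputable reports
    has_artifact: bool = False             # config file, endpoint, card stub, official statement


def classify(e: Evidence) -> Label:
    # Confirmed: a published model card with downloadable weights, nothing else.
    if e.model_card_with_weights:
        return Label.CONFIRMED
    # Strong signal: multiple reports (assumed >= 2), or one report plus an artifact.
    if e.independent_reports >= 2 or (e.independent_reports >= 1 and e.has_artifact):
        return Label.STRONG_SIGNAL
    # Everything else is cadence-based or inferential.
    return Label.SPECULATION
```

For example, `classify(Evidence(independent_reports=1, has_artifact=True))` returns `Label.STRONG_SIGNAL`, while a loud preview with no model card and no reports stays at `Label.SPECULATION`.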
Where the real state lives
We deliberately do not maintain a "current top open-weight models" list here because that list goes stale faster than this article can be republished. The canonical primary source for each major lab is its own Hugging Face organization page or news index:
- Meta Llama — huggingface.co/meta-llama
- Alibaba Qwen — huggingface.co/Qwen and github.com/QwenLM
- Mistral AI — huggingface.co/mistralai and mistral.ai/news
- DeepSeek — huggingface.co/deepseek-ai and github.com/deepseek-ai
- Google DeepMind (Gemma) — deepmind.google/technologies/gemma and huggingface.co/google
- Zhipu / GLM — huggingface.co/THUDM
- Microsoft Phi — huggingface.co/microsoft
Open one tab per lab and you have a more accurate "what shipped this month" view than any tracker article.
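If you would rather script the tab-opening, a minimal sketch using Python's standard `webbrowser` module might look like this. The URLs are copied from the list above; the `LAB_PAGES` dictionary and both helper names are assumptions of this example. It opens one tab per listed page, so labs with two sources get two tabs.

```python
import webbrowser

# Pages copied from the list above; the structure and names are hypothetical.
LAB_PAGES = {
    "Meta Llama": ["huggingface.co/meta-llama"],
    "Alibaba Qwen": ["huggingface.co/Qwen", "github.com/QwenLM"],
    "Mistral AI": ["huggingface.co/mistralai", "mistral.ai/news"],
    "DeepSeek": ["huggingface.co/deepseek-ai", "github.com/deepseek-ai"],
    "Google DeepMind (Gemma)": [
        "deepmind.google/technologies/gemma",
        "huggingface.co/google",
    ],
    "Zhipu / GLM": ["huggingface.co/THUDM"],
    "Microsoft Phi": ["huggingface.co/microsoft"],
}


def lab_urls(pages=LAB_PAGES):
    """Return every listed page as a full https:// URL, in listed order."""
    return [f"https://{page}" for group in pages.values() for page in group]


def open_all(urls):
    """Open each URL in a new browser tab using the system default browser."""
    for url in urls:
        webbrowser.open_new_tab(url)
```

Calling `open_all(lab_urls())` should open the pages in your default browser. Nothing here checks what each lab has actually released; it only gets you to the primary sources faster.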
One verified example, for shape
To make the methodology concrete: the Gemma 4 family (Google DeepMind) shipped as open weights on 2026-04-03, in 2B, 9B, and 27B sizes with base and instruction-tuned variants. The primary source is the Gemma overview page above, plus the corresponding Hugging Face model cards. That is what a Confirmed claim looks like in this cluster: a date you can verify, a model card you can open, no third-party-tracker-only sourcing.
If you read a sentence about a specific open-weight model on this site that does not meet that bar, it is either wrong or should not have been published. We try hard to make sure neither happens.
What is announced but not yet shipped
A model belongs here when the lab itself has publicly committed to a future open-weight release but the weights are not yet downloadable: a forthcoming-release blog post, a keynote with a date, a Hugging Face placeholder card. Once weights land, the claim moves to Confirmed and the dated comparison work moves to the Bench cluster.
We do not enumerate the current "announced but not shipped" set here. It changes too fast. Check each lab's own announcements.
What is genuinely speculative
Cadence claims ("Llama tends to ship every N months"), capability extrapolations ("the next release will probably add X"), and pricing-tier predictions are all Speculation in this framework. They have explanatory value when paired with the underlying release log or paper. They have no value as predictions, and we do not use them that way.
The honest read of 2026: the major labs ship faster than any monthly article can keep up with. Most "open-weight release" predictions older than a few weeks are stale by the time you read them. Pair this page's framework with the lab links above, and you have everything you need to read the landscape better than the headlines do.
What this cluster is not
Not a leaderboard. Not a buying guide. Not a venue for unsourced screenshots. The Bench cluster handles dated head-to-heads with disclosed methodology. The Local Models cluster handles which model to actually run on your hardware. This cluster handles "what is coming, with what confidence" — and where the confidence is low, the article says so plainly.