What to learn first
Prompting Fundamentals
The mechanics every prompt rests on: system vs. user messages, context windows, zero- and few-shot patterns, templates, and how to give a model what it needs to answer well.
Take me to the Fundamentals hub
Prompting Patterns
Reusable techniques like role prompting, tree-of-thoughts, ReAct, prompt chaining, self-consistency, and negative prompting — with the kind of example you can paste and modify.
Take me to the Patterns hub
Prompting By Use Case
Long-form, specific guides for the actual things people prompt for: code, writing, summarization, data extraction, classification, and analysis.
Take me to the Use Cases hub
Local Models
Running LLMs on your own hardware: how the stack works, which runtimes to pick, what quantization actually changes, and which open-weight models are genuinely usable right now.
Take me to the Local hub
Model Benchmarks
Honest head-to-heads between frontier and open-weight models. We disclose the prompts, the temperature, the seed, and the limits — every comparison is timestamped.
Take me to the Bench hub
Release Radar
A dated, sourced tracker of new and rumored model releases. Every claim is tagged Confirmed, Strong signal, or Speculation, with a link back to the primary source.
Take me to the Radar hub
Agents & Tool Use
Building LLM agents that actually do useful work: the agent loop, tool calling across major APIs, the Model Context Protocol, and the failure modes that make agents the wrong shape for many problems.
Take me to the Agents hub
Context Engineering
The discipline that replaced clever prompts once context windows got large: deciding what information goes into the model's input, in what order, how much, and what to leave out. Includes the failure modes of long contexts and the memory patterns production systems use.
Take me to the Context hub
Evals & Testing
How working teams actually evaluate LLM applications: building golden sets, designing rubrics, using LLM-as-judge without inheriting its biases, regression-testing prompts in CI, and which of the dozen eval frameworks is worth picking up.
Take me to the Evals hub
RAG & Retrieval
Retrieval-augmented generation, end to end: the pipeline that actually runs in production, how to chunk documents without breaking meaning, when to use vector vs. keyword vs. hybrid retrieval, and when RAG beats long context (and when it does not).
Take me to the RAG hub