GEO & AI Citation

Generative Engine Optimization: how to get your content cited by Claude, ChatGPT, Perplexity, and Gemini. The discipline that took over from SEO when answer engines started doing the reading for users — what the citation mechanics actually look like, what works, and how to measure it.

As of 2026-05-24

The acronym GEO — Generative Engine Optimization — entered the vocabulary in 2023 via a paper from Aggarwal et al. (Princeton / Georgia Tech) that ran the first controlled study of what makes AI answer engines more likely to cite a source. The term has stuck because the underlying shift is real. When the user's interface to information becomes "ask Claude / ChatGPT / Perplexity / Gemini," ranking on Google starts to matter less than being cited in the AI's answer.

This article is the umbrella for what GEO is, how it differs from SEO, and what the mechanics actually look like.

The core difference

The cleanest definitional difference:

  • SEO optimizes for rank position on a search results page. The success metric is "appear in the top 3 for query X." The user clicks through; you measure success in clicks.
  • GEO optimizes for being chosen as a cited source inside a generated answer. The success metric is "Claude / ChatGPT / Perplexity actually cites our page when answering query X." The user may never click through; success is partly invisible.

Both depend on a crawl-and-index step, both depend on authority signals, both depend on structured content. They diverge on what happens after the retrieval. Search returns ten blue links; generative answer engines return a synthesized response, possibly with citations, and the choice of which sources to cite is its own selection problem on top of ranking.

The mechanics of citation

When an AI assistant generates an answer to a query, the typical pipeline, which varies by platform, has four steps:

  1. Crawl and index. A crawler (Perplexity's PerplexityBot, OpenAI's GPTBot and ChatGPT-User, Anthropic's ClaudeBot, Google's Googlebot, etc.) fetches your pages. Google-Extended is not a separate crawler: it is a robots.txt token that controls whether content Google has already crawled may be used for Gemini. Some platforms maintain their own index (Perplexity, Brave); others lean on partnerships or licensed indexes (OpenAI has used Bing).
  2. Retrieval. When the user asks a question, the system retrieves a candidate set of pages relevant to the query.
  3. Generate with citations. The LLM reads the retrieved content and generates an answer, selecting which sources to cite inline based on what actually fed the answer.
  4. Render. The user sees the answer with the cited sources. They may click through; they may not.

For each of those steps there are levers you can pull. Most of GEO is about pulling those levers deliberately.
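
The crawl step is the easiest lever to check mechanically. The sketch below uses Python's standard urllib.robotparser to report which of the crawler tokens named above a given robots.txt would turn away. The function name blocked_ai_crawlers, the token list, and the example.com URL are assumptions for illustration, and the standard-library parser may not match each vendor's own robots.txt handling exactly.

```python
from urllib.robotparser import RobotFileParser

# Tokens taken from the crawlers named in this article; extend as needed.
AI_CRAWLER_TOKENS = [
    "GPTBot",
    "ChatGPT-User",
    "ClaudeBot",
    "PerplexityBot",
    "Google-Extended",
]


def blocked_ai_crawlers(robots_txt: str, url: str = "https://example.com/") -> list[str]:
    """Return the tokens that this robots.txt text disallows for `url`.

    `url` is a hypothetical page on your own site; pass a real one to test
    path-specific rules.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [token for token in AI_CRAWLER_TOKENS if not parser.can_fetch(token, url)]


if __name__ == "__main__":
    sample = "User-agent: GPTBot\nDisallow: /\n\nUser-agent: *\nAllow: /\n"
    print(blocked_ai_crawlers(sample))
```

With the sample above, only GPTBot is reported as blocked, because the permissive wildcard group still admits the other tokens. To check a live site, fetch its robots.txt yourself and pass the text in, which keeps the function free of network calls and easy to test.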

What works — the Aggarwal et al. findings (and the follow-ups)

The original paper tested several content modifications and measured changes in citation rate across generative search engines. The findings that have held up across follow-up work:

  • Adding statistics moves citation rate. Pages that include specific numbers, percentages, and quantitative claims get cited more often than equivalent prose. Generative answers value concrete data points; they are easier to lift into a response than vague paraphrase.
  • Adding citations to authoritative sources moves citation rate. Pages that themselves cite primary sources tend to get cited more, both because they look more credible and because the answer engine can chain the citation.
  • Fluent, well-structured prose moves citation rate. Easy-to-quote sentences are easier to put in an answer.
  • Classic keyword-density SEO does not move citation rate much. Stuffing the keyword has long been broken for Google ranking; for GEO it adds nothing.

The empirical picture: writing well, citing sources, and including data outperforms tactical optimization for AI citation.
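
A rough way to audit a draft against these findings is to count the signals they name. The sketch below is a heuristic, not the method used in the paper: the regular expressions and the audit_draft function are assumptions for illustration, and a raw count says nothing about whether the numbers or links are any good.

```python
import re

# Both patterns are deliberately simple assumptions, not a validated metric.
LINK_RE = re.compile(r"https?://[^\s)\]>\"']+")
NUMBER_RE = re.compile(r"\d[\d,]*(?:\.\d+)?%?")


def audit_draft(text: str) -> dict:
    """Count rough proxies for statistics and source links in a draft."""
    links = LINK_RE.findall(text)
    # Strip URLs first so digits inside them are not counted as statistics.
    prose = LINK_RE.sub(" ", text)
    numbers = NUMBER_RE.findall(prose)
    return {
        "statistics": len(numbers),
        "outbound_links": len(links),
    }


if __name__ == "__main__":
    draft = (
        "Citation rate rose 40% across 10,000 queries "
        "(source: https://example.com/study2023)."
    )
    print(audit_draft(draft))
```

A draft that scores zero on both counts is a candidate for a pass that adds concrete figures and primary-source links. Years and list numbers also match the number pattern, so treat the output as a prompt for review rather than a score.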

What changed since 2023

A few important developments since the original paper:

  • The major AI assistants have stabilized their crawler and citation practices. Perplexity documents PerplexityBot, OpenAI documents GPTBot and ChatGPT-User, Anthropic documents ClaudeBot, and Google documents the Google-Extended control token. Each can be addressed by name in robots.txt, though honoring the directives remains voluntary on the crawler's side.
  • Structured citation surfaces. Perplexity has had inline numbered citations from day one. ChatGPT search (introduced as SearchGPT) and Google's AI Overviews have both formalized their citation surfaces. Claude's web search returns sources alongside answers. Citation as a UX pattern is now standard.
  • The llms.txt proposal. Jeremy Howard introduced llms.txt in September 2024 — a simple markdown file at the root of your domain listing key pages and descriptions, in a format easy for LLMs to consume. Adoption is partial; the standard is genuinely useful as a discoverability layer even where no AI assistant formally honors it yet. See the-llms-txt-standard-explained.
  • Specialized analytics. Tools that specifically measure AI citation traffic (Profound, Otterly, Peec AI, Goodie, BrightEdge AI Visibility) have emerged because regular analytics platforms do not capture the citation step well. See measuring-ai-citation-traffic.
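
Before reaching for a dedicated tool, a first approximation of AI-related traffic can be pulled from ordinary access logs. The sketch below labels a request by user-agent token (crawler fetches) or by referrer host (click-throughs from an assistant). The referrer hostnames, the classify_hit and summarize functions, and the label format are assumptions for illustration; verify the hosts against what actually shows up in your own logs. Google-Extended is left out because it is a robots.txt token and does not appear in request user-agent strings.

```python
from collections import Counter
from typing import Iterable
from urllib.parse import urlparse

CRAWLER_UA_TOKENS = ("GPTBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot")

# Assumed referrer hosts; adjust to match your own log data.
AI_REFERRER_HOSTS = {
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "www.perplexity.ai": "Perplexity",
    "claude.ai": "Claude",
    "gemini.google.com": "Gemini",
}


def classify_hit(user_agent: str, referrer: str) -> str:
    """Label one request as an AI crawler fetch, an AI referral, or other."""
    ua = user_agent.lower()
    for token in CRAWLER_UA_TOKENS:
        if token.lower() in ua:
            return f"crawler:{token}"
    # Log files often record a missing referrer as "-"; hostname is then None.
    host = (urlparse(referrer).hostname or "").lower()
    if host in AI_REFERRER_HOSTS:
        return f"referral:{AI_REFERRER_HOSTS[host]}"
    return "other"


def summarize(hits: Iterable[tuple[str, str]]) -> Counter:
    """Tally labels over (user_agent, referrer) pairs parsed from a log."""
    return Counter(classify_hit(ua, ref) for ua, ref in hits)
```

Crawler counts show whether the assistants are reading the site at all; referral counts show the visible part of citation traffic. Citations that never produce a click remain invisible to this approach, which is the gap the specialized tools aim to close.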

How to think about GEO if you're starting now

A short list of priorities, ordered:

  1. Make sure AI crawlers can read your site. Check that your robots.txt does not block GPTBot, ClaudeBot, PerplexityBot, Google-Extended. Some sites accidentally block them; some block them on purpose without thinking through the citation cost.
  2. Write with citations and data. This is the single highest-leverage editorial move. Concrete numbers, primary-source links, attributable claims. The AI assistants prefer to cite sources that themselves cite sources.
  3. Structure for scannable answers. Clear question-shaped headings, summary boxes near the top, FAQ schema where appropriate. Many AI assistants seem to weight content that looks like an answer.
  4. Ship an llms.txt. Cheap, useful even where not formally honored, signals to the AI ecosystem that you understand the medium.
  5. Set up citation measurement. You cannot optimize what you do not measure. Even a basic referrer log + AI-crawler tracking puts you ahead of most sites in 2026.
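
For the llms.txt item, here is a minimal sketch of generating the file from a page list, assuming the layout described in the llms.txt proposal: an H1 title, a blockquote summary, then H2 sections containing markdown links with short descriptions. The Page class, the render_llms_txt function, and the example.com URLs are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str
    description: str


def render_llms_txt(site_name: str, summary: str, sections: dict[str, list[Page]]) -> str:
    """Render llms.txt content: title, summary, then one link list per section."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for page in pages:
            lines.append(f"- [{page.title}]({page.url}): {page.description}")
        lines.append("")
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    content = render_llms_txt(
        "Example Site",
        "Guides on generative engine optimization.",
        {"Guides": [Page("GEO overview", "https://example.com/geo", "What GEO is")]},
    )
    print(content, end="")
```

Write the returned string to /llms.txt at the root of the domain. Generating it from the same page inventory that feeds the sitemap keeps the two from drifting apart.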

The companion articles in this cluster go deeper on each.

What GEO is not

A few things worth being explicit about:

  • It is not a separate web from SEO. The same site that ranks well for humans tends to be the same site AI assistants cite. The fundamentals (crawlability, authority, quality, structure) carry over almost entirely.
  • It is not a guarantee of traffic. Citation visibility is the new metric and click-through is the old one; both matter, and a gain in one does not guarantee a gain in the other.
  • It is not a substitute for being genuinely useful. The Aggarwal et al. finding that fluent, well-cited content wins is consistent with the older SEO finding that good content wins. GEO is mostly a more honest formulation of the same advice.

Working engineers and editorial teams that have been doing SEO well are most of the way to doing GEO well. The remaining gap is in measuring AI-specific surface area and in shipping the new artifacts (like llms.txt) the ecosystem now expects.

Forthcoming

  • Schema Markup for LLM Citation
  • Structuring Content for AI Answers
  • AI Crawler Allowlist
