GEO: How to Get Cited by AI (A Practitioner's Playbook)

SEO got you into ten blue links. GEO gets you into the answer.

Generative Engine Optimization (GEO) is the practice of structuring your content so AI platforms (ChatGPT, Perplexity, Google AI Overviews, Claude, Gemini) cite you when answering questions. It's not a rebrand of SEO. The mechanics are different, the signals are different, and keyword stuffing, the tactic much of the SEO industry was built on, actively hurts you here.

I've been deploying AI systems in enterprise settings for several years and writing about context engineering since early 2025. Over the past year, I've watched GEO go from a niche academic concept to something every business deploying content needs to understand. This is what I've learned — what the research says, what I've seen work, and what's coming next.

The shift is already measurable

AI search isn't coming. It's here. And it changes the math on visibility.

SEO is a fight for one of ten positions on a results page. GEO is a fight for a small number of citations in a synthesized answer, and that number varies widely by platform (see the table below). Each slot carries more weight than a blue link: an AI citation is an implicit endorsement, not just a link.

The platforms have different personalities. This matters more than most people realize:

Platform	Avg Citations/Query	Favors	Key Signal
ChatGPT	7.9	Wikipedia, mainstream news, .edu	Recency — 76% of top-cited pages were updated in the last 30 days
Perplexity	21.9	Reddit, specialized content, 3K+ word guides	Depth and comprehensiveness
Google AI Overviews	Varies	YouTube, Reddit, diverse sources	Multi-perspective — highest citation diversity

Here's the number that should get your attention: only 11% of domains are cited by both ChatGPT and Perplexity. They're different ecosystems. Optimizing for one doesn't guarantee the other.

What the research says

Researchers at Princeton, Georgia Tech, the Allen Institute for AI, and IIT Delhi published the study that named the field (KDD 2024). It tested nine optimization methods against GEO-bench, a benchmark of 10,000 queries. The headline results:

Method	Visibility Impact	Effort
Cite external sources inline	+30-40%	Low
Add statistics with attribution	+30-40%	Low
Add named expert quotes	+30-40%	Low
Use technical terminology	+28%	Low
Keyword stuffing	-10% (harmful)	—

Three things stand out.

First, the top three methods (inline citations, specific statistics, and attributed expert quotes) all require minimal content changes but deliver a 30-40% improvement in AI visibility. That's a meaningful lift for work you can do in a week.

Second, keyword stuffing — the move that built the SEO industry — is actively harmful in GEO. Negative ten percent. If your content team is still optimizing for keyword density, they're making you less visible to AI, not more.

Third, authoritative tone by itself underperformed. AI systems don't care that you sound authoritative. They care that you are authoritative — which means specific claims, named sources, and verifiable data.

How AI crawlers actually work

Before tactics, you need to understand the infrastructure. AI platforms send dedicated bots to read your site:

Bot	Operator	Monthly Fetches
Googlebot	Google	4.5B
GPTBot	OpenAI	569M
ClaudeBot	Anthropic	370M
Applebot	Apple	314M
PerplexityBot	Perplexity	24.4M

There's one constraint that kills most sites before the content even matters: most AI crawlers cannot execute JavaScript. If your content loads dynamically via JS frameworks, these bots see a blank page. Server-side rendering or static generation isn't optional — it's a prerequisite.

Check your robots.txt. Many sites accidentally block GPTBot or ClaudeBot with broad Disallow rules. If you're blocked, you're invisible. Full stop.
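
A quick way to run this check is Python's standard-library robots.txt parser. The sketch below is a minimal example, not a full audit: the bot names are the ones listed in this article, while the sample robots.txt and the yoursite.com URL are hypothetical. In practice you would paste in the contents of your own robots.txt.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in this article; confirm each against the operator's docs.
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "ChatGPT-User"]


def blocked_bots(robots_txt: str, url: str, bots=AI_BOTS) -> list[str]:
    """Return the bots that the given robots.txt text forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in bots if not parser.can_fetch(bot, url)]


# Hypothetical robots.txt: a broad wildcard block plus one explicit allowance.
SAMPLE = """\
User-agent: *
Disallow: /

User-agent: GPTBot
Allow: /
"""

if __name__ == "__main__":
    # Only GPTBot has its own rule group, so the other three inherit the wildcard block.
    print(blocked_bots(SAMPLE, "https://yoursite.com/pricing"))
```

The sample illustrates the accidental-block pattern: a wildcard Disallow written with other crawlers in mind also shuts out every AI bot that lacks its own rule group.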

The practitioner's playbook

I break this into four layers. Do them in order — each one depends on the one before it.

Layer 1: Technical foundation (do this first)

Allow AI crawlers in robots.txt — verify GPTBot, ClaudeBot, PerplexityBot, ChatGPT-User are not blocked
Server-side render everything — no content behind JS
HTTPS sitewide
Page speed under 1.8s mobile
Implement llms.txt at your domain root (more on this below)

If any of these are broken, nothing else matters. An AI crawler that can't read your site can't cite your site.
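
To approximate what a crawler without JavaScript sees, you can look at the raw HTML your server returns and check whether your key sentences are already in it. The sketch below assumes you have fetched that HTML yourself with a plain HTTP GET. The two sample pages and the phrase being checked are hypothetical, and real crawlers may extract text differently.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text present in the raw HTML, ignoring script and style bodies."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.depth = 0    # greater than zero while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if not self.depth and data.strip():
            self.chunks.append(data.strip())


def missing_without_js(raw_html, expected_phrases):
    """Return the expected phrases that do not appear in the page's static text."""
    parser = VisibleText()
    parser.feed(raw_html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return [p for p in expected_phrases if p.lower() not in text]


# Hypothetical pages: one server-rendered, one that only renders in the browser.
SSR_PAGE = "<html><body><h1>Pricing</h1><p>Plans start at $29 per month.</p></body></html>"
CSR_PAGE = ('<html><body><div id="root"></div>'
            '<script>render("Plans start at $29 per month.")</script></body></html>')

if __name__ == "__main__":
    print(missing_without_js(SSR_PAGE, ["Plans start at $29"]))
    print(missing_without_js(CSR_PAGE, ["Plans start at $29"]))
```

If a phrase you care about comes back as missing, that content only exists after JavaScript runs, which is the failure mode described above.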

Layer 2: Content structure (make it extractable)

AI systems don't evaluate pages. They evaluate passages. Each section under an H2 or H3 heading is parsed independently for relevance and factual density. A single strong paragraph on an otherwise mediocre page can still get cited.

Structure for extraction:

Answer-first format. Start every section with a direct, standalone answer before expanding. The opening sentence under an H2 is the most extractable unit on the page.
Clean heading hierarchy. H2 and H3 tags are passage boundary signals. One topic per heading. Don't skip levels.
Tables and numbered lists. Highly parseable. Comparative data, feature matrices, step-by-step instructions — AI systems love these formats because they're modular and unambiguous.
Explicit "Last Updated" dates. Display them. AI systems use freshness as a citation signal. AI-cited content is 25.7% fresher on average than traditional Google organic results.
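
The heading rules above are easy to lint. Here is a rough sketch that pulls the heading outline out of a page and reports places where the hierarchy skips a level. It uses a regular expression rather than a real HTML parser, which is an assumption that holds for simple, well-formed markup only. The sample page is hypothetical.

```python
import re

HEADING = re.compile(r"<h([1-6])[^>]*>(.*?)</h\1>", re.IGNORECASE | re.DOTALL)
TAG = re.compile(r"<[^>]+>")


def heading_outline(html):
    """Return (level, text) pairs for every h1-h6 in document order."""
    return [(int(m.group(1)), TAG.sub("", m.group(2)).strip())
            for m in HEADING.finditer(html)]


def skipped_levels(outline):
    """Return (previous_level, level, text) wherever a heading jumps more than one level deeper."""
    problems = []
    for (prev_level, _), (level, text) in zip(outline, outline[1:]):
        if level > prev_level + 1:
            problems.append((prev_level, level, text))
    return problems


# Hypothetical page fragment with one skipped level (h2 straight to h4).
PAGE = (
    "<h1>GEO Guide</h1>"
    "<h2>What is GEO?</h2><p>GEO is ...</p>"
    "<h4>Details</h4>"
    "<h2>Measuring</h2>"
    "<h3>Share of Model</h3>"
)

if __name__ == "__main__":
    outline = heading_outline(PAGE)
    for prev_level, level, text in skipped_levels(outline):
        print(f"h{prev_level} -> h{level}: {text!r}")
```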

Layer 3: Content quality (make it citable)

This is where the Princeton research pays off. Three changes, 30-40% lift:

Cite sources inline. Every factual claim gets a source. "According to [Source], [specific number]." Not "studies show" — name the study.
Replace qualitative with quantitative. "Significant growth" becomes "32% YoY growth (Source, 2025)." Specific numbers with attribution.
Attribute claims to named experts. "As [Name], [Title] at [Company] notes: '[quote].'" Named experts with credentials signal citation-worthiness.

And cut keyword stuffing. If your content reads like it was written for a search engine, rewrite it for a reader who happens to be an AI.
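
A crude linter can surface the sentences that need this treatment. The sketch below flags any sentence containing a vague phrase so an editor can replace it with a named source or a number. The phrase list is a hypothetical starter set, not something from the research, and the sentence splitting is deliberately naive.

```python
import re

# Hypothetical starter list; extend it with the vague phrasing your own team uses.
VAGUE_PHRASES = [
    "studies show",
    "research suggests",
    "experts say",
    "significant growth",
]


def flag_vague_sentences(text):
    """Return (sentence, matched_phrases) for each sentence containing a vague phrase."""
    flagged = []
    # Naive split: break on whitespace that follows ., ! or ?
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        hits = [p for p in VAGUE_PHRASES if p in sentence.lower()]
        if hits:
            flagged.append((sentence, hits))
    return flagged


DRAFT = (
    "Studies show AI search is growing. "
    "Revenue grew 32% YoY (Source, 2025). "
    "We saw significant growth last year."
)

if __name__ == "__main__":
    for sentence, hits in flag_vague_sentences(DRAFT):
        print(hits, "->", sentence)
```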

Layer 4: Authority (off-site signals)

Third-party mentions are the highest-signal non-technical lever for GEO. Each platform has its favorite sources:

Wikipedia — ChatGPT draws 4.8% of its citations from Wikipedia. If you don't have an entry, you're leaving visibility on the table.
Reddit — Perplexity draws 6.6% from Reddit. Genuine community participation, not spam.
YouTube — Google AI Overviews cite YouTube heavily. Video content matters.
Digital PR — Target 20+ high-authority domain mentions per quarter.

llms.txt — the entry point

If you haven't implemented llms.txt, start here. It's a markdown file at your domain root that gives AI systems a curated map of your most important content. Instead of burning tokens on messy HTML, the AI gets a clean, structured overview.

844,000+ sites have adopted it. The format is simple:

# Your Company Name

> What you do, who you serve, what matters.

## Core Documentation
- [Product Overview](https://yoursite.com/platform): What it does
- [Pricing](https://yoursite.com/pricing): Plans and tiers

## Use Cases
- [Enterprise](https://yoursite.com/enterprise): How enterprises use it

Include your 10-20 most important URLs with descriptive annotations. Curate aggressively — the point is signal, not completeness. Maintain it quarterly.
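
If your important URLs already live in a CMS or a spreadsheet, you can generate the file instead of hand-editing it each quarter. This is a minimal sketch that reproduces the format shown above. The function name and the data layout are assumptions for illustration, and the URLs are the same placeholders used in the example.

```python
def render_llms_txt(name, summary, sections):
    """Render an llms.txt body: title line, blockquote summary, then sections of annotated links.

    sections maps a section heading to a list of (title, url, note) tuples.
    """
    lines = [f"# {name}", "", f"> {summary}"]
    for heading, links in sections.items():
        lines += ["", f"## {heading}"]
        lines += [f"- [{title}]({url}): {note}" for title, url, note in links]
    return "\n".join(lines) + "\n"


# Placeholder content matching the example above.
SECTIONS = {
    "Core Documentation": [
        ("Product Overview", "https://yoursite.com/platform", "What it does"),
        ("Pricing", "https://yoursite.com/pricing", "Plans and tiers"),
    ],
    "Use Cases": [
        ("Enterprise", "https://yoursite.com/enterprise", "How enterprises use it"),
    ],
}

if __name__ == "__main__":
    body = render_llms_txt(
        "Your Company Name",
        "What you do, who you serve, what matters.",
        SECTIONS,
    )
    print(body, end="")
```

Writing the returned string to llms.txt at your domain root is left out on purpose, since how you deploy static files depends on your stack.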

Honest caveat: no major LLM provider has officially confirmed they use llms.txt. But server log monitoring from multiple sources shows Microsoft and OpenAI crawlers accessing these files. The cost is a few hours of setup. The potential upside justifies it.

What's next: from lists to graphs

llms.txt is a reading list. It tells an AI "here are our important pages." That's valuable, but it doesn't convey how concepts relate to each other, which content supersedes what, or how to traverse from one idea to the next.

The infrastructure stack is evolving toward structured knowledge graphs — context files that give AI not just a list of content but a map of relationships. Think of it as the difference between handing someone a bibliography and handing them an org chart with annotations.

I wrote about the need for a control plane last year. The GEO application of that idea is becoming clear: businesses that publish machine-readable knowledge graphs — with explicit edges between concepts, metadata on authorship and freshness, and traversal logic for AI agents — will have a structural advantage over businesses that publish flat content and hope the AI figures out the relationships.
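
To make the idea concrete, here is a toy sketch of what such a graph could look like in code. This article does not specify a format, so every name here (Node, ContextGraph, the "supersedes" relation, the sample URLs and dates) is invented for illustration. It shows only the three ingredients named above: explicit edges, authorship and freshness metadata, and a simple traversal.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    url: str
    title: str
    author: str
    updated: str  # ISO date string, e.g. "2025-06-01"


@dataclass
class ContextGraph:
    nodes: dict = field(default_factory=dict)  # url -> Node
    edges: list = field(default_factory=list)  # (source_url, relation, target_url)

    def add(self, node):
        self.nodes[node.url] = node

    def link(self, source, relation, target):
        self.edges.append((source, relation, target))

    def current_version(self, url):
        """Follow 'supersedes' edges from an old page to the newest page that replaces it."""
        seen = {url}
        while True:
            newer = [s for s, rel, t in self.edges
                     if rel == "supersedes" and t == url and s not in seen]
            if not newer:
                return self.nodes[url]
            url = newer[0]
            seen.add(url)


if __name__ == "__main__":
    graph = ContextGraph()
    graph.add(Node("/pricing-2024", "Pricing (2024)", "Hypothetical Author", "2024-03-01"))
    graph.add(Node("/pricing-2025", "Pricing (2025)", "Hypothetical Author", "2025-02-10"))
    # Edge direction: the source page replaces the target page.
    graph.link("/pricing-2025", "supersedes", "/pricing-2024")
    print(graph.current_version("/pricing-2024").title)
```

A flat list of links cannot express that one pricing page replaces another. An explicit edge can, and an agent that lands on the old page has a path to the current one.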

This is early. But the trajectory is obvious. robots.txt → sitemap.xml → schema.org → llms.txt → structured context graphs. Each layer gave machines a deeper understanding of what a business actually is. The next layer gives them comprehension, not just access.

Measuring GEO

You can't manage what you don't measure. The core metric is Share of Model (SoM):

SoM = (Your Citations / Total Citations in Your Category) × 100

How to measure it for free:

Define 20-50 prompts covering awareness, consideration, and decision-stage queries in your space
Run them across ChatGPT, Perplexity, Google AI Overviews, and Claude
Record: were you cited? What position? Was it accurate?
Calculate your baseline SoM
Track monthly
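
The arithmetic fits in a few lines once the results are recorded. The sketch below assumes each run is logged as a platform, a prompt, and the list of domains cited in the answer. It treats "total citations in your category" as every citation recorded across your prompt set, which is one reasonable reading of the formula rather than a standard definition. The prompts and domains are made up.

```python
def share_of_model(runs, brand):
    """SoM = (brand citations / total citations) x 100 over the given runs."""
    cited = [domain for run in runs for domain in run["cited"]]
    if not cited:
        return 0.0
    return 100.0 * cited.count(brand) / len(cited)


def share_by_platform(runs, brand):
    """Compute SoM separately for each platform that appears in the runs."""
    platforms = sorted({run["platform"] for run in runs})
    return {p: share_of_model([r for r in runs if r["platform"] == p], brand)
            for p in platforms}


# Hypothetical log of three prompt runs.
RUNS = [
    {"platform": "ChatGPT", "prompt": "best crm for startups",
     "cited": ["yoursite.com", "competitor-a.com", "wikipedia.org"]},
    {"platform": "Perplexity", "prompt": "best crm for startups",
     "cited": ["competitor-a.com", "reddit.com", "competitor-b.com",
               "yoursite.com", "competitor-a.com"]},
    {"platform": "Claude", "prompt": "crm pricing comparison",
     "cited": ["competitor-b.com", "competitor-a.com"]},
]

if __name__ == "__main__":
    # 2 of the 10 recorded citations point at yoursite.com, so this prints 20.0
    print(share_of_model(RUNS, "yoursite.com"))
    print(share_by_platform(RUNS, "yoursite.com"))
```

Splitting by platform matters because of the low overlap between ChatGPT and Perplexity citations noted earlier: a blended number can hide the fact that you are invisible on one of them.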

Expect a 10-20% improvement in SoM in months 2-3 after implementing the playbook above, and 30-40% in months 4-6.

For tooling: Otterly.ai ($29-$989/mo) tracks citations across platforms. Semrush has an AI toolkit. HubSpot has a free AEO grader for a basic assessment. But a spreadsheet and monthly discipline will get you 80% of the way there.

The bottom line

GEO isn't optional if you care about visibility. AI search is eating traditional search, and the rules are different. Keyword stuffing hurts you. Inline citations help you. Structure matters more than length. Freshness matters more than authority theater.

The good news: the highest-impact changes are low-effort. Inline citations, statistics with attribution, named expert quotes, clean heading structure, and an llms.txt file. You can implement most of this in a week.

The businesses that move first will compound their advantage. AI citation patterns self-reinforce: once an AI system treats you as an authoritative source, competitors face a higher bar to displace you.

Don't wait for this to become obvious. By then, the positions will be taken.