How to get cited by ChatGPT search: a practical playbook
ChatGPT search cites a small set of pages per answer — typically 3 to 8 — and the same handful of domains dominate most categories (Reddit, G2, Capterra, Wikipedia and the brand's own site). Getting cited isn't about ranking #1 on Google; it's about being the cleanest, most extractable source for a specific buyer question. This page walks through the practical changes that move the needle, in priority order.
The 8 steps, in priority order
Audit your current ChatGPT citation footprint
Before changing anything, measure where you stand. Run a brand-aware query set against ChatGPT search and record three numbers: visibility rate (how often your brand is mentioned at all), citation rate (how often your domain appears as a numbered source), and competitor share-of-voice. Without this baseline you can't tell which of the changes below actually worked.
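The three baseline numbers can be computed from a simple log of recorded answers. The sketch below is illustrative only: `AnswerRecord`, `audit_baseline` and the field names are hypothetical, and share-of-voice is assumed to mean each brand's share of all brand mentions across the query set, which may differ from how your own tooling defines it.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class AnswerRecord:
    """One recorded ChatGPT search answer (hypothetical structure)."""
    query: str
    brands_mentioned: set[str]   # brand names that appear in the answer prose
    cited_domains: list[str]     # domains in the numbered source list


def audit_baseline(records: list[AnswerRecord], brand: str, domain: str) -> dict:
    """Return visibility rate, citation rate and share-of-voice for one brand."""
    total = len(records)
    if total == 0:
        raise ValueError("record at least one answer before computing a baseline")

    mentioned = sum(1 for r in records if brand in r.brands_mentioned)
    cited = sum(1 for r in records if domain in r.cited_domains)

    # Assumption: share-of-voice = a brand's mentions / all brand mentions.
    mention_counts = Counter(b for r in records for b in r.brands_mentioned)
    all_mentions = sum(mention_counts.values())
    share = {b: n / all_mentions for b, n in mention_counts.items()}

    return {
        "visibility_rate": mentioned / total,
        "citation_rate": cited / total,
        "share_of_voice": share,
    }
```

Store the returned dictionary alongside the date and the exact query set, so later audits compare like with like.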
Unblock GPTBot and OAI-SearchBot in robots.txt
OpenAI runs separate crawlers: GPTBot collects training data and OAI-SearchBot indexes pages for ChatGPT search. If OAI-SearchBot is blocked in robots.txt, you cannot be cited. OpenAI documents the GPTBot setting as independent of search, but this playbook recommends allowing both. Check yourdomain.com/robots.txt and confirm neither User-agent is disallowed. Also confirm your CDN or WAF (Cloudflare, AWS WAF) isn't silently blocking or rate-limiting them. This is the single most common reason brands disappear from ChatGPT citations entirely.
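The robots.txt check can be automated with Python's standard-library parser. This is a minimal sketch: the helper name `crawler_access` and the sample URL are assumptions. It evaluates robots.txt rules only, so it cannot detect blocking at the CDN or WAF layer.

```python
from urllib.robotparser import RobotFileParser

# The two user-agent tokens named in this step.
OPENAI_AGENTS = ("GPTBot", "OAI-SearchBot")


def crawler_access(robots_txt: str, url: str) -> dict[str, bool]:
    """Report whether each OpenAI crawler may fetch `url` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in OPENAI_AGENTS}


# A robots.txt that allows both crawlers explicitly.
EXPLICIT_ALLOW = """\
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /
"""

if __name__ == "__main__":
    print(crawler_access(EXPLICIT_ALLOW, "https://yourdomain.com/pricing"))
```

Paste the contents of your live robots.txt into the first argument and test a few of the URLs you most want cited, since path-specific Disallow rules can block a page even when the homepage is open.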
Add an Allow: / line for both bots to make intent unambiguous.
Server-render the content you want cited
ChatGPT's retrieval crawler reads the initial HTML response. Content injected by client-side JavaScript after page load is invisible to it. If your stats, comparison tables or feature lists only appear after React hydrates, they won't be extracted. Move the cite-worthy content into the SSR or static HTML payload. Test with curl -A 'OAI-SearchBot' yourdomain.com and confirm the text you want quoted is in the raw response.
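Once you have the raw response from curl, the "is my text actually in the HTML?" check can be scripted. The sketch below assumes you pass in the saved response body as a string; `missing_phrases` is a hypothetical helper, and it deliberately ignores anything inside script and style tags, because data embedded there for later hydration is not visible text.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def missing_phrases(raw_html: str, phrases: list[str]) -> list[str]:
    """Return the phrases that do NOT appear in the initial HTML's visible text."""
    extractor = _TextExtractor()
    extractor.feed(raw_html)
    extractor.close()
    # Normalise whitespace and case so line breaks in markup don't cause misses.
    text = " ".join(" ".join(extractor.parts).split()).lower()
    return [p for p in phrases if " ".join(p.split()).lower() not in text]
```

An empty list means every phrase you want quoted is present before any JavaScript runs. Anything returned is a candidate for moving into the server-rendered payload.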
In Next.js, use generateStaticParams or server components. Avoid useEffect-only data fetching for cite-worthy facts.
Publish FAQPage and Article JSON-LD on every page
ChatGPT search disproportionately cites pages with valid FAQPage, Article and HowTo schema because the structured Q&A pairs map directly to the way users phrase queries. Add JSON-LD blocks where the schema genuinely matches the content — fake FAQ schema gets penalised. Also add Organization schema sitewide with sameAs links to your LinkedIn, Crunchbase and Wikidata entries to disambiguate your entity.
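As a rough illustration of what a FAQPage block looks like, the sketch below builds one from question and answer pairs that already appear on the page. The function name `faq_jsonld` is hypothetical; the `@type` and property names follow schema.org's FAQPage, Question and Answer types.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Render (question, answer) pairs as a FAQPage JSON-LD script tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    # Escape "</" so answer text can never close the script tag early;
    # "\/" is a valid JSON escape, so the payload still parses unchanged.
    payload = json.dumps(data, indent=2, ensure_ascii=False).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```

Only feed it pairs that are visibly present in the page copy, in line with the warning above about schema that does not match the content.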
Write in extractable chunks — 2 to 3 sentences per point
ChatGPT extracts short, self-contained passages it can quote without surrounding context. Long flowing paragraphs are skipped. Restructure key pages so each claim sits in a 2-3 sentence chunk with a clear subject, a concrete fact and a number where possible. Use descriptive H2s and H3s that match how buyers actually phrase the question ("What is X", "How much does X cost", "X vs Y").
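A quick way to find paragraphs that break the 2 to 3 sentence guideline is a small linter over your page copy. This is a heuristic sketch, not a definitive tool: `flag_chunks` is a hypothetical name, and the sentence splitter is naive, so abbreviations such as "e.g." will be over-counted.

```python
import re


def sentence_count(paragraph: str) -> int:
    """Naive count: split after ., ! or ? when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return len([p for p in parts if p])


def flag_chunks(paragraphs: list[str], max_sentences: int = 3) -> list[dict]:
    """Report, per paragraph, its length and whether it contains a number."""
    report = []
    for text in paragraphs:
        n = sentence_count(text)
        report.append(
            {
                "sentences": n,
                "too_long": n > max_sentences,
                "has_number": bool(re.search(r"\d", text)),
            }
        )
    return report
```

Paragraphs flagged `too_long` are candidates for splitting, and those without `has_number` are worth a second look for a concrete figure you could add.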
Earn third-party citations on the domains ChatGPT trusts
Across most categories, the top-cited domains in ChatGPT search are Reddit, G2, Capterra, Wikipedia, TechCrunch and the relevant trade publications. Your own site usually appears once per answer; with 3 to 8 sources per answer, the other 2 to 7 citations come from these third parties. Get listed on G2 and Capterra (B2B SaaS), participate in genuine Reddit threads in your category, ensure your Wikipedia entity is correct, and pitch the trade pubs ChatGPT already cites for your topic.
Publish an llms.txt file at your root
llms.txt is an emerging convention (similar to robots.txt or sitemap.xml) that tells LLM-based search engines which pages on your site are canonical and what they cover. While not all engines honour it yet, it's cheap to publish and we've seen citation lift on the engines that do consume it. Point to your highest-quality, most extractable pages — comparison guides, pricing, definitive how-tos. See our llms.txt examples for working templates.
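For orientation, the sketch below generates a file in the shape the llms.txt proposal describes: a title line, a one-line summary in a blockquote, then sections of annotated links. The function name, site name and URLs are placeholders, and since the convention is still emerging, treat the layout as an assumption to verify against the current proposal.

```python
def build_llms_txt(
    site_name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Build llms.txt content from {section heading: [(title, url, note), ...]}."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Placeholder content; swap in your own canonical pages.
    print(
        build_llms_txt(
            "Acme",
            "Acme is a hypothetical CRM for small sales teams.",
            {
                "Guides": [
                    ("Pricing", "https://yourdomain.com/pricing", "Current plans and prices"),
                ]
            },
        )
    )
```

Write the returned string to llms.txt at the site root and keep the list short, limited to the pages you most want quoted.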
Re-measure every 4 to 6 weeks
ChatGPT updates its retrieval pipeline and index continuously. A change that worked in March may not move citations in June, and competitors are making the same changes you are. Schedule a recurring audit so you can see which interventions actually shifted citation rate, catch regressions early (a redeploy that breaks SSR is the classic), and adapt as the cited-domain mix in your category shifts.
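Comparing two audits can be as simple as diffing citation rates per query group. In the sketch below, `citation_deltas`, the grouping of queries and the 5-point threshold are all assumptions chosen for illustration; pick a threshold that matches the noise level of your own query set.

```python
def citation_deltas(
    previous: dict[str, float],
    current: dict[str, float],
    threshold: float = 0.05,
) -> dict[str, tuple[float, str]]:
    """Compare citation rates per query group between two audits.

    Rates are fractions between 0 and 1. A group missing from one audit
    is treated as a rate of 0.0 for that audit.
    """
    changes = {}
    for group in sorted(set(previous) | set(current)):
        delta = current.get(group, 0.0) - previous.get(group, 0.0)
        if delta <= -threshold:
            status = "regression"
        elif delta >= threshold:
            status = "improvement"
        else:
            status = "flat"
        changes[group] = (round(delta, 4), status)
    return changes
```

Run it after each scheduled audit and investigate any "regression" first, since a sudden drop across every group often points to a crawlability problem rather than a content one.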
What "being cited by ChatGPT" actually means
ChatGPT search returns an answer plus a numbered list of source links — typically 3 to 8 per response. A "citation" means your domain appears in that numbered list, and ideally your specific URL is the one ChatGPT quotes from in the answer text. This is distinct from being mentioned (your brand name appears in the prose but no link is given) and distinct from a Google AI Overview citation, which uses a different retrieval pipeline.
The bar for citation is different from ranking on Google. ChatGPT weighs backlinks and domain authority less heavily than classical SEO does. It rewards extractable content on a crawlable page from a domain it has seen cited before in similar contexts. That's why a clean, well-structured page on a smaller site can be cited ahead of a Forbes article that buries the answer 12 paragraphs deep.
Why ChatGPT cites Reddit, G2 and Wikipedia so often
Across nearly every category we audit, the top-cited domains in ChatGPT search are remarkably consistent: Reddit, G2, Capterra, Wikipedia, TechCrunch and the relevant trade publications. The reason is structural — these sites publish the cleanest, most directly-quotable answers to the questions users ask ChatGPT. Reddit threads are literally Q&A pairs. G2 and Capterra pages are structured comparison data. Wikipedia is canonical entity definition.
The practical implication: getting cited isn't purely about optimising your own site. It's about ensuring you have a presence on the third-party domains ChatGPT already trusts in your category. A single accurate G2 listing or a well-researched Reddit comment often drives more ChatGPT citation lift than a month of on-site changes.
How ChatGPT search differs from Google AI Overviews
Google's AI Overview renders on roughly 25-48% of queries (varying by category and intent) and pulls from the classical Google index, biased heavily toward existing top-10 organic results. ChatGPT search runs on a different retrieval stack, weighs domain authority less heavily, and renders on every query by definition. The overlap in cited domains is real but partial — winning AIO does not guarantee winning ChatGPT, and vice versa.
This is why measuring both matters. A brand might be cited in 60% of ChatGPT answers in its category and 10% of AI Overviews, or the reverse. monitoraeo's free industry rankings show the split per engine for 30+ categories.
Common mistakes that kill ChatGPT citation rate
Four mistakes account for most of the "why aren't we cited?" cases we see. First: a Cloudflare WAF rule that blocks OAI-SearchBot as a "bad bot" — the brand has no idea, the crawler can't reach the page, citations stop. Second: a JS-only React or Vue site where the cite-worthy content never appears in the initial HTML. Third: thin pages with no FAQPage schema and walls of marketing prose instead of extractable chunks. Fourth: an Organization entity that's ambiguous — no sameAs to Wikidata, no Wikipedia entry, no consistent brand string across the web.
None of these are hard to fix. They're just rarely caught without a structured audit. A paid monitoraeo audit runs a 15-check GEO scan that flags all four classes of issue with the specific URL and fix.