How to get cited by ChatGPT search: a practical playbook

ChatGPT search cites a small set of pages per answer — typically 3 to 8 — and the same handful of domains dominate most categories (Reddit, G2, Capterra, Wikipedia and the brand's own site). Getting cited isn't about ranking #1 on Google; it's about being the cleanest, most extractable source for a specific buyer question. This page walks through the practical changes that move the needle, in priority order.

The 8 steps, in priority order

1

Audit your current ChatGPT citation footprint

Before changing anything, measure where you stand. Run a brand-aware query set against ChatGPT search and record three numbers: visibility rate (how often your brand is mentioned at all), citation rate (how often your domain appears as a numbered source), and competitor share-of-voice. Without this baseline you can't tell which of the changes below actually worked.

Tip — A monitoraeo paid audit runs 40 brand-aware questions across 5 AI engines (8 in the free preview) and returns the visibility rate, citation rate, and cited-domain leaderboard per engine.
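The three baseline numbers can be computed from a simple log of answers. The sketch below is a minimal illustration, not monitoraeo's method: the `Answer` record, the `audit_metrics` function and the example brand and domain names are all hypothetical, and it assumes you have already recorded, for each query, which brands were mentioned and which domains appeared in the source list.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Answer:
    """One recorded ChatGPT search answer (hypothetical record layout)."""
    brands_mentioned: set            # brand names that appear in the answer prose
    cited_domains: list = field(default_factory=list)  # domains in the numbered source list


def audit_metrics(answers, brand, domain):
    """Return visibility rate, citation rate and share-of-voice per brand."""
    total = len(answers)
    if total == 0:
        raise ValueError("need at least one recorded answer")
    # Visibility: the brand is named at all. Citation: the domain is a listed source.
    visible = sum(brand in a.brands_mentioned for a in answers)
    cited = sum(domain in a.cited_domains for a in answers)
    # Share-of-voice: each brand's fraction of all brand mentions in the set.
    mentions = Counter(b for a in answers for b in a.brands_mentioned)
    all_mentions = sum(mentions.values())
    share = {b: n / all_mentions for b, n in mentions.items()} if all_mentions else {}
    return {
        "visibility_rate": visible / total,
        "citation_rate": cited / total,
        "share_of_voice": share,
    }
```

Keep the query set fixed between runs; the rates are only comparable when the denominator is the same set of questions.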
2

Unblock GPTBot and OAI-SearchBot in robots.txt

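OpenAI runs separate crawlers that do different jobs: GPTBot collects content for model training, OAI-SearchBot builds the index that ChatGPT search retrieves and cites from, and ChatGPT-User fetches a page on demand when a user asks ChatGPT to open it. Blocking OAI-SearchBot in robots.txt is the setting that removes you from search citations. OpenAI documents the GPTBot setting as independent of search, so a GPTBot block works as a training opt-out rather than a citation blocker. Check yourdomain.com/robots.txt, confirm OAI-SearchBot is not disallowed, and decide deliberately whether to allow GPTBot as well. Also confirm your CDN or WAF (Cloudflare, AWS WAF) isn't silently blocking or rate-limiting these user agents. A blocked OAI-SearchBot is the single most common reason brands disappear from ChatGPT citations entirely.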

Tip — Add an explicit Allow: / line for both bots to make intent unambiguous.
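The robots.txt check can be scripted with Python's standard-library parser. This is a minimal sketch: `blocked_agents` and the `SAMPLE` file are made up for illustration, and it only evaluates robots.txt rules, so it says nothing about CDN or WAF blocking.

```python
from urllib.robotparser import RobotFileParser

# The two user-agent tokens named in this step.
OPENAI_AGENTS = ("GPTBot", "OAI-SearchBot")


def blocked_agents(robots_txt, url, agents=OPENAI_AGENTS):
    """Return the agents that the given robots.txt text forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


# Illustrative robots.txt: GPTBot is blocked, OAI-SearchBot is explicitly allowed.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: *
Disallow: /admin/
"""

print(blocked_agents(SAMPLE, "https://yourdomain.com/pricing"))  # ['GPTBot']
```

To test a live site, paste the contents of yourdomain.com/robots.txt into the function, or use `RobotFileParser.set_url()` followed by `read()` to fetch the file.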
3

Server-render the content you want cited

ChatGPT's retrieval crawler reads the initial HTML response. Content injected by client-side JavaScript after page load is invisible to it. If your stats, comparison tables or feature lists only appear after React hydrates, they won't be extracted. Move the cite-worthy content into the SSR or static HTML payload. Test with curl -A 'OAI-SearchBot' yourdomain.com and confirm the text you want quoted is in the raw response.

Tip — Next.js: use generateStaticParams or server components. Avoid useEffect-only data fetching for cite-worthy facts.
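The curl test can be turned into a repeatable check. The sketch below is an illustration under stated assumptions: `missing_phrases` and `fetch_raw_html` are hypothetical helpers, the short `OAI-SearchBot` user-agent string mirrors the curl example above rather than the full string the real crawler sends, and text inside `<script>` tags is deliberately ignored because it is data for hydration, not rendered content.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def missing_phrases(raw_html, phrases):
    """Return the phrases that do NOT appear in the text of the raw HTML."""
    extractor = _TextExtractor()
    extractor.feed(raw_html)
    extractor.close()
    # Normalise whitespace and case so line breaks in the markup don't matter.
    text = " ".join(" ".join(extractor.parts).split()).lower()
    return [p for p in phrases if " ".join(p.split()).lower() not in text]


def fetch_raw_html(url, user_agent="OAI-SearchBot"):
    """Fetch the initial HTML response without executing any JavaScript."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")
```

A call such as `missing_phrases(fetch_raw_html("https://yourdomain.com/pricing"), ["Plans start at $49"])` returns an empty list when the fact is in the server-rendered payload. Running it in CI after each deploy is one way to catch a redeploy that breaks SSR.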
4

Publish FAQPage and Article JSON-LD wherever it fits the content

ChatGPT search disproportionately cites pages with valid FAQPage, Article and HowTo schema because the structured Q&A pairs map directly to the way users phrase queries. Add JSON-LD blocks only where the schema genuinely matches the visible content; FAQ markup for questions the page doesn't actually answer gets penalised. Also add Organization schema sitewide with sameAs links to your LinkedIn, Crunchbase and Wikidata entries to disambiguate your entity.

Tip — Validate every block with Google's Rich Results Test before deploying.
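Generating the JSON-LD from the same data that renders the page keeps the markup and the visible content in sync. This is a minimal sketch using schema.org's FAQPage and Organization types; the helper names and the example question are hypothetical.

```python
import json


def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def organization_jsonld(name, url, same_as):
    """Build a schema.org Organization object with sameAs profile links."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": list(same_as),
    }


def as_script_tag(data):
    """Serialise to a JSON-LD script tag; '</' is escaped so the tag can't close early."""
    payload = json.dumps(data, indent=2, ensure_ascii=False).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


print(as_script_tag(faq_jsonld([("What is Acme?", "Acme is a hypothetical product.")])))
```

Emit the tag in the server-rendered HTML, not from client-side JavaScript, so it is present in the initial response.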
5

Write in extractable chunks — 2 to 3 sentences per point

ChatGPT extracts short, self-contained passages it can quote without surrounding context. Long flowing paragraphs are skipped. Restructure key pages so each claim sits in a 2-3 sentence chunk with a clear subject, a concrete fact and a number where possible. Use descriptive H2s and H3s that match how buyers actually phrase the question ("What is X", "How much does X cost", "X vs Y").

Tip — Read your page aloud — if a chunk can't stand alone as a quote, ChatGPT won't use it as one.
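A rough script can flag the paragraphs that exceed the 2 to 3 sentence target. This is a heuristic sketch, not a measure of what ChatGPT actually extracts: the function names are hypothetical, and the sentence splitter is naive, so abbreviations such as "e.g." will be overcounted.

```python
import re

# Naive sentence boundary: whitespace that follows ., ! or ?
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentence_count(paragraph):
    """Count sentences in one paragraph using the naive boundary above."""
    return len([s for s in _SENTENCE_END.split(paragraph.strip()) if s])


def oversized_chunks(text, max_sentences=3):
    """Return (paragraph_index, sentence_count) for paragraphs over the limit."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    counts = [(i, sentence_count(p)) for i, p in enumerate(paragraphs)]
    return [(i, n) for i, n in counts if n > max_sentences]


page = "Acme costs $49 per month. Annual billing saves 20%.\n\nOne. Two. Three. Four."
print(oversized_chunks(page))  # [(1, 4)]
```

Use the output as a to-do list for manual editing; splitting a paragraph mechanically does not make each chunk stand alone as a quote.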
6

Earn third-party citations on the domains ChatGPT trusts

Across most categories, the top-cited domains in ChatGPT search are Reddit, G2, Capterra, Wikipedia, TechCrunch and the relevant trade publications. Your own site usually appears once per answer; the rest of the 3 to 8 citations come from these third parties. Get listed on G2 and Capterra (B2B SaaS), participate in genuine Reddit threads in your category, ensure your Wikipedia entity is correct, and pitch the trade pubs ChatGPT already cites for your topic.

Tip — Run a ChatGPT audit and note which non-brand domains appear in answers about your category — that's your target media list.
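The target media list is a frequency count over the cited URLs you recorded. The sketch below assumes you have the source links for each answer as a list of URLs; `domain_of` and `target_media_list` are hypothetical names, and stripping only a leading "www." is a simplification that does not merge other subdomains.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_of(url):
    """Return the lowercase host of a URL without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def target_media_list(cited_urls_per_answer, own_domain, top_n=10):
    """Rank non-brand domains by the number of answers that cite them."""
    counts = Counter()
    for urls in cited_urls_per_answer:
        # Count each domain once per answer so one link-heavy answer can't dominate.
        domains = {domain_of(u) for u in urls}
        domains.discard("")
        domains.discard(own_domain)
        counts.update(domains)
    return counts.most_common(top_n)


answers = [
    ["https://www.reddit.com/r/x/1", "https://reddit.com/r/x/2", "https://acme.example/pricing"],
    ["https://www.g2.com/products/acme", "https://www.reddit.com/r/y/3"],
]
print(target_media_list(answers, own_domain="acme.example"))  # [('reddit.com', 2), ('g2.com', 1)]
```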
7

Publish an llms.txt file at your root

llms.txt is an emerging convention (similar to robots.txt or sitemap.xml) that tells LLM-based search engines which pages on your site are canonical and what they cover. While not all engines honour it yet, it's cheap to publish and we've seen citation lift on the engines that do consume it. Point to your highest-quality, most extractable pages — comparison guides, pricing, definitive how-tos. See our llms.txt examples for working templates.

Tip — Keep it under 50 entries; quality over quantity. Markdown bullet list with absolute URLs.
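An llms.txt file can be generated from a short list of pages. The sketch below follows the layout of the public llms.txt proposal as we understand it (an H1 title, a blockquote summary, then H2 sections of markdown links); the function name, the 50-entry cap taken from the tip above, and the example site are illustrative assumptions.

```python
def build_llms_txt(site_name, summary, sections, max_entries=50):
    """Render llms.txt text from {section heading: [(title, url, note), ...]}."""
    total = sum(len(links) for links in sections.values())
    if total > max_entries:
        raise ValueError(f"{total} entries exceeds the limit of {max_entries}")
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            # Absolute URLs only, so the file is unambiguous outside the site.
            if not url.startswith(("https://", "http://")):
                raise ValueError(f"use an absolute URL: {url}")
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


print(build_llms_txt(
    "Acme",
    "Hypothetical invoicing software for small teams.",
    {"Guides": [("Pricing", "https://acme.example/pricing", "Plans and prices")]},
))
```

Write the result to the site root so it is served at yourdomain.com/llms.txt.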
8

Re-measure every 4 to 6 weeks

ChatGPT updates its retrieval pipeline and index continuously. A change that worked in March may not move citations in June, and competitors are making the same changes you are. Schedule a recurring audit so you can see which interventions actually shifted citation rate, catch regressions early (a redeploy that breaks SSR is the classic), and adapt as the cited-domain mix in your category shifts.

Tip — monitoraeo's monitoring tier re-runs the same query set monthly and trends visibility, citation rate and competitor share-of-voice.
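Comparing two audits comes down to per-metric deltas and a regression threshold. This is a minimal sketch: `rate_changes` is a hypothetical helper, the 5-point threshold is an arbitrary assumption to tune for your query-set size, and both audits must use the same query set for the comparison to mean anything.

```python
def rate_changes(previous, current, threshold=0.05):
    """Return (deltas, regressions) for metrics present in both audits.

    A metric counts as a regression when it fell by more than `threshold`.
    """
    deltas = {k: current[k] - previous[k] for k in previous if k in current}
    regressions = sorted(k for k, d in deltas.items() if d < -threshold)
    return deltas, regressions


march = {"visibility_rate": 0.50, "citation_rate": 0.25}
june = {"visibility_rate": 0.55, "citation_rate": 0.10}
deltas, regressions = rate_changes(march, june)
print(regressions)  # ['citation_rate']
```

A sudden drop in citation rate with stable visibility is worth checking against the crawlability and server-rendering steps first, since those fail silently.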

What "being cited by ChatGPT" actually means

ChatGPT search returns an answer plus a numbered list of source links — typically 3 to 8 per response. A "citation" means your domain appears in that numbered list, and ideally your specific URL is the one ChatGPT quotes from in the answer text. This is distinct from being mentioned (your brand name appears in the prose but no link is given) and distinct from a Google AI Overview citation, which uses a different retrieval pipeline.

The bar for citation is different from ranking on Google. ChatGPT doesn't reward backlinks or domain authority the way classical SEO does. It rewards extractable content on a crawlable page from a domain it has seen cited before in similar contexts. That's why a clean, well-structured page on a smaller site can be cited ahead of a Forbes article that buries the answer 12 paragraphs deep.

Why ChatGPT cites Reddit, G2 and Wikipedia so often

Across nearly every category we audit, the top-cited domains in ChatGPT search are remarkably consistent: Reddit, G2, Capterra, Wikipedia, TechCrunch and the relevant trade publications. The reason is structural — these sites publish the cleanest, most directly-quotable answers to the questions users ask ChatGPT. Reddit threads are literally Q&A pairs. G2 and Capterra pages are structured comparison data. Wikipedia is canonical entity definition.

The practical implication: getting cited isn't purely about optimising your own site. It's about ensuring you have a presence on the third-party domains ChatGPT already trusts in your category. A single accurate G2 listing or a well-researched Reddit comment often drives more ChatGPT citation lift than a month of on-site changes.

How ChatGPT search differs from Google AI Overviews

Google's AI Overview renders on roughly 25-48% of queries (varying by category and intent) and pulls from the classical Google index, biased heavily toward existing top-10 organic results. ChatGPT search runs on a different retrieval stack, weighs domain authority less heavily, and returns a sourced answer for every query it handles rather than for a subset. The overlap in cited domains is real but partial: winning AI Overviews (AIO) does not guarantee winning ChatGPT, and vice versa.

This is why measuring both matters. A brand might be cited in 60% of ChatGPT answers in its category and 10% of AI Overviews, or the reverse. monitoraeo's free industry rankings show the split per engine for 30+ categories.

Common mistakes that kill ChatGPT citation rate

Four mistakes account for most of the "why aren't we cited?" cases we see. First: a Cloudflare WAF rule that blocks OAI-SearchBot as a "bad bot" — the brand has no idea, the crawler can't reach the page, citations stop. Second: a JS-only React or Vue site where the cite-worthy content never appears in the initial HTML. Third: thin pages with no FAQPage schema and walls of marketing prose instead of extractable chunks. Fourth: an Organization entity that's ambiguous — no sameAs to Wikidata, no Wikipedia entry, no consistent brand string across the web.

None of these are hard to fix. They're just rarely caught without a structured audit. A paid monitoraeo audit runs a 15-check GEO scan that flags all four classes of issue with the specific URL and fix.
