ChatGPT citation analysis
ChatGPT citation analysis is the practice of measuring how often, and on which queries, your domain is cited as a source in ChatGPT answers. It is the diagnostic step before any answer engine optimization (AEO) work: without it, you are guessing which content needs the most help.
What a citation analysis measures
A useful citation analysis tracks four things per query. (1) Citation rate: the percentage of answers that include your domain as a source. (2) Citation rank: when you are cited, the position of your link in the source list (first, third, eighth). (3) Co-citation: which other domains appear alongside yours. Those are your real AI-era competitors. (4) Page-level breakdown: which specific URLs on your site are being pulled, not just the apex domain.
Aggregating these across a representative query set (40 to 200 buyer-intent questions in your category) gives you a baseline. Repeating monthly gives you the trend. Without the trend, a single snapshot is hard to act on.
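The four measurements above can be computed from nothing more than the ordered list of cited URLs in each answer. The following is a minimal sketch, not a reference implementation: the function names, the sample URLs, and the choice to match on the exact host (minus a leading `www.`) are all assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the host of a URL, lowercased, without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def query_metrics(answers: list[list[str]], my_domain: str) -> dict:
    """Summarise one query. `answers` holds one ordered list of cited URLs per answer.

    Assumption: subdomains (e.g. blog.example.com) count as separate domains.
    """
    ranks = []            # 1-based position of our first link, per answer that cites us
    co_cited = Counter()  # other domains that appear in answers that cite us
    pages = Counter()     # which of our URLs were pulled, counted once per answer
    for cited_urls in answers:
        domains = [domain_of(u) for u in cited_urls]
        if my_domain not in domains:
            continue
        ranks.append(domains.index(my_domain) + 1)
        co_cited.update(set(domains) - {my_domain})
        pages.update({u for u in cited_urls if domain_of(u) == my_domain})
    return {
        "citation_rate": len(ranks) / len(answers) if answers else 0.0,
        "mean_rank": sum(ranks) / len(ranks) if ranks else None,
        "co_cited": co_cited.most_common(),
        "pages": pages.most_common(),
    }


# Hypothetical data: three answers collected for the same query.
sample_answers = [
    ["https://rival.com/a", "https://www.example.com/guide", "https://reddit.com/r/x"],
    ["https://example.com/pricing", "https://rival.com/a"],
    ["https://rival.com/b"],
]
metrics = query_metrics(sample_answers, "example.com")
```

With the sample data, the domain is cited in two of three answers, at positions 2 and 1, and `rival.com` is the most frequent co-cited domain.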
How to build a citation analysis from scratch
You need three components. (1) A query set that reflects how real buyers ask, not how you describe your product. Pull queries from Google Search Console, GA4 internal site search, and sales call transcripts. Aim for 40 to 100 queries across informational, commercial-investigation, and brand-comparison intents. (2) A way to script the queries through ChatGPT. The official API exposes web search as a tool on the Responses endpoint. A headless browser also works; it is slower but useful for capturing what users actually see. (3) A consistent parser that extracts citations into a structured format (query, answer text, list of cited URLs with positions).
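As an illustration of the third component, the sketch below turns one API response into the structured record described above. It assumes the response has been converted to a plain dict and follows the shape the Responses API is understood to use at the time of writing: an `output` list containing `message` items whose `output_text` parts carry `url_citation` annotations. Verify that shape against the current API reference before relying on it. `CitationRecord`, `parse_response`, and the sample payload are hypothetical names and data, and "position" here means order of first appearance in the answer.

```python
from dataclasses import dataclass, field


@dataclass
class CitationRecord:
    query: str
    answer_text: str
    # (position, url) pairs; position is 1-based order of first appearance.
    citations: list[tuple[int, str]] = field(default_factory=list)


def parse_response(query: str, payload: dict) -> CitationRecord:
    """Extract answer text and de-duplicated cited URLs from a response dict."""
    texts: list[str] = []
    urls: list[str] = []
    for item in payload.get("output", []):
        if item.get("type") != "message":
            continue  # skip tool-call items such as the web search call itself
        for part in item.get("content", []):
            if part.get("type") != "output_text":
                continue
            texts.append(part.get("text", ""))
            for annotation in part.get("annotations", []):
                url = annotation.get("url")
                if annotation.get("type") == "url_citation" and url and url not in urls:
                    urls.append(url)
    return CitationRecord(query, "\n".join(texts), list(enumerate(urls, start=1)))


# Hypothetical payload, trimmed to the fields the parser reads.
sample_payload = {
    "output": [
        {"type": "web_search_call"},
        {
            "type": "message",
            "content": [
                {
                    "type": "output_text",
                    "text": "Two options stand out.",
                    "annotations": [
                        {"type": "url_citation", "url": "https://example.com/guide"},
                        {"type": "url_citation", "url": "https://rival.com/a"},
                        {"type": "url_citation", "url": "https://example.com/guide"},
                    ],
                }
            ],
        },
    ]
}
record = parse_response("best crm for startups", sample_payload)
```

Keeping the parser separate from the code that sends the queries means the same function can be reused if the collection method changes, for example from the API to a headless browser that produces a dict in the same shape.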
Run the set monthly. Each run takes 10 to 30 minutes of API time depending on query count. Store every run: the month-over-month diff is the real output of the analysis.
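A month-over-month diff can be as simple as comparing two mappings of query to citation rate. The sketch below assumes each stored run has already been reduced to that mapping; the function name and the sample numbers are illustrative only.

```python
def month_over_month(previous: dict[str, float], current: dict[str, float]) -> list[dict]:
    """Compare citation rates per query between two stored runs.

    A query missing from either run gets delta=None rather than a guessed value.
    """
    rows = []
    for query in sorted(set(previous) | set(current)):
        before = previous.get(query)
        after = current.get(query)
        delta = None if before is None or after is None else round(after - before, 4)
        rows.append({"query": query, "before": before, "after": after, "delta": delta})
    return rows


# Hypothetical citation rates from two consecutive monthly runs.
last_month = {"best crm for startups": 0.2, "crm pricing comparison": 0.6}
this_month = {
    "best crm for startups": 0.6,
    "crm pricing comparison": 0.4,
    "crm vs spreadsheet": 0.0,
}
report = month_over_month(last_month, this_month)
```

Queries added to the set this month show up with no delta, which keeps them from being mistaken for gains or losses.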
What to do with the data
Three cuts of the data are actionable. (1) Queries where you are cited at low rank: your page is in the consideration set but losing to a better answer. This is usually fixable by restructuring that specific page into a clearer extractive format. (2) Queries where competitors are cited and you are not cited at all: you need to publish or earn coverage on those topics. Look at which competitor pages are cited and reverse-engineer their structure. (3) Queries where a third-party site (review aggregator, Wikipedia, Reddit) is cited above all brand sites: that is your PR target. Getting mentioned on those sites moves your brand's citation rate faster than any on-site change.
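The three cuts can be applied mechanically once each query has an ordered list of cited domains. This is a rough sketch under stated assumptions: the label names, the set of third-party domains, and the threshold for "low rank" (position 3 or worse) are placeholders to tune, not values taken from the analysis above.

```python
# Hypothetical list of third-party sites; extend it for your category.
THIRD_PARTY = {"wikipedia.org", "reddit.com", "g2.com"}


def classify_query(
    cited_domains: list[str],
    my_domain: str,
    third_party: set[str] = THIRD_PARTY,
    low_rank_from: int = 3,
) -> list[str]:
    """Return the cuts that apply to one query, given domains in citation order."""
    cuts = []
    if my_domain in cited_domains:
        # Cut 1: cited, but at or below the assumed low-rank threshold.
        if cited_domains.index(my_domain) + 1 >= low_rank_from:
            cuts.append("restructure_page")
    elif any(d not in third_party for d in cited_domains):
        # Cut 2: at least one other brand is cited and we are absent.
        cuts.append("content_gap")
    if cited_domains and cited_domains[0] in third_party:
        # Cut 3: a third-party site sits above every brand site.
        cuts.append("pr_target")
    return cuts


low_rank = classify_query(["rival.com", "reddit.com", "example.com"], "example.com")
gap = classify_query(["rival.com", "other.com"], "example.com")
gap_and_pr = classify_query(["reddit.com", "rival.com"], "example.com")
healthy = classify_query(["example.com", "rival.com"], "example.com")
```

A query can fall into more than one cut, which is why the function returns a list rather than a single label.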
Avoid the trap of trying to fix every page at once. Pick the top 10 highest-traffic-potential queries each month and ship targeted changes against those.
Tools and pitfalls
You can do this manually with a spreadsheet and 2 hours per week, with browser extensions that scrape ChatGPT's UI, or with a dedicated tool that runs queries via API and aggregates the data over time. The DIY route works at small scale but breaks down past 50 queries per month because the variance between runs is high, so every query has to be repeated. ChatGPT's web search is non-deterministic: the same query can produce different citation lists. Running each query 3 to 5 times and averaging the citation rate gives a more stable signal.
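The averaging step can be sketched as follows, assuming each query has been run several times and each run has been reduced to a yes/no outcome (cited or not). The function name and sample outcomes are hypothetical. The sketch also lists queries whose result flipped between runs, since those are the ones a single run would misreport.

```python
from statistics import mean


def averaged_citation_rates(outcomes: dict[str, list[bool]]) -> dict:
    """Average repeated runs per query and flag queries whose result varied."""
    per_query = {
        query: mean(1.0 if cited else 0.0 for cited in runs)
        for query, runs in outcomes.items()
        if runs  # skip queries with no recorded runs
    }
    return {
        "per_query": per_query,
        # Cited in some runs but not all: a single run would be misleading here.
        "unstable": sorted(q for q, rate in per_query.items() if 0.0 < rate < 1.0),
        "overall": mean(per_query.values()) if per_query else 0.0,
    }


# Hypothetical outcomes from 3 to 4 repeated runs per query.
sample_outcomes = {
    "best crm for startups": [True, False, True, True],
    "crm pricing comparison": [False, False, False],
    "crm vs spreadsheet": [True, True, True],
}
summary = averaged_citation_rates(sample_outcomes)
```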
Common pitfalls: querying only branded terms (you will always look good), running only at one time of day (result mix and freshness shift across the day and the week), and conflating Bing rank with ChatGPT citation rank (Bing ranking is one input among many to ChatGPT's re-ranker, not the only one).
Frequently asked
How many queries do I need for a useful baseline?
A set of 40 to 100 buyer-intent queries is the practical starting range. Below 40, the variance between runs swamps the signal. Past 200, you get diminishing returns on the broader picture but more confidence on category-specific cuts.
Can I just look at Google Search Console for this?
No. Search Console shows Google web search impressions, not ChatGPT citations. There is no first-party analytics for being cited in ChatGPT, so you have to run the queries yourself or use a tool that does.
How often should I run the analysis?
Monthly is the right cadence for most teams. ChatGPT's index refreshes constantly, but the practical impact of any single content change usually plays out over 2 to 4 weeks, so a monthly run captures the signal without drowning in noise.
Why do I get different citations when I run the same query twice?
ChatGPT's web search is non-deterministic. Result mix shifts between runs based on freshness, model temperature, and ranking ties. Run each query 3 to 5 times and average to get a stable citation rate.
Should I optimise for being cited or for being named?
Both, but in this order. Citation rate moves first and is more controllable through on-site changes (schema, structure, recency). Being named follows once the AI has enough trust signal to put your brand in the answer prose, which typically takes 2 to 3 weeks longer.