Structured data for AI engines
Structured data (JSON-LD) tells AI engines exactly what your page is about, in a format they can parse without ambiguity. The schemas that move the needle for AI citations are different from the schemas that win classical SEO rich results — and getting the wrong ones wastes implementation effort.
The 6 schema types that matter most for AI
- FAQPage: highest leverage. AI engines (especially Google AI Overview and ChatGPT) preferentially extract from FAQPage-marked content. Wrap every Q&A block on the site.
- Organization: the canonical entity signal. Include name, url, logo, description, and sameAs links to verifiable profiles (LinkedIn, Crunchbase, Wikipedia, X). Publish sitewide via the layout.
- Article: for any editorial content. Include headline, datePublished, dateModified, author (with Person schema), and a publisher reference. AI engines weight recency heavily, so dateModified directly affects citation rate.
- Product / Service: for commercial pages. Include name, description, a brand reference, and an Offer with price + availability. Helps AI engines extract pricing accurately (a top hallucination vector when missing).
- Person: for author bylines on editorial content. Include name, jobTitle, and sameAs links to LinkedIn/X/GitHub. Anthropic's Claude weights named authorship explicitly.
- BreadcrumbList: for any page deeper than the homepage. Cheap, easy, helps AI engines build the hierarchy of your site.
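As a rough illustration of the first two types in the list, the sketch below builds Organization and FAQPage JSON-LD as plain Python dicts and serializes them into a script tag for server-rendered HTML. The helper names (organization, faq_page, to_script_tag) and the ORG_ID value are hypothetical and chosen for this example only; the property names come from Schema.org.

```python
import json

# Example value only: the single sitewide @id other nodes would point at.
ORG_ID = "https://yoursite.com/#org"


def organization(name, url, logo, description, same_as):
    """Organization node, meant to be emitted once via the site layout."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "@id": ORG_ID,
        "name": name,
        "url": url,
        "logo": logo,
        "description": description,
        "sameAs": list(same_as),
    }


def faq_page(qa_pairs):
    """Build a FAQPage node from (question, answer) string pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }


def to_script_tag(data):
    """Serialize a JSON-LD dict into a script tag for the HTML response."""
    payload = json.dumps(data, ensure_ascii=False, indent=2)
    # Escape "</" so answer text cannot close the script element early;
    # "<\/" is still valid JSON and decodes back to "</".
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```

Calling to_script_tag(faq_page(pairs)) from the server-side template keeps the markup in the initial response rather than injecting it from client-side JavaScript.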
Schema types that DON'T help AI (and may waste effort)
- Recipe, HowTo (most cases) — unless your content is genuinely procedural, these don't help. Recipe schema on a non-recipe page is sometimes worse than no schema.
- VideoObject — only helps if you have actual video content. Marking a page with a single embedded YouTube as a Video page is overreach.
- Review / AggregateRating — these help classical SEO rich results but AI engines treat self-published ratings with suspicion. Third-party review schema (from G2, Capterra) helps more than your own.
- SoftwareApplication — useful in some cases, but AI engines often ignore software-specific schema in favour of Organization + Product. Lower leverage than the top 6.
How to validate your schema
Two tools, both free. Google's Rich Results Test (search.google.com/test/rich-results) — paste any URL, see what schemas Google parsed and which rich result types you're eligible for. Schema.org's validator (validator.schema.org) — pure spec compliance, catches structural errors. Run both on every new content page before considering it shipped.
Common mistake: schema validates structurally but the entity references (publisher @id, author @id) point to entities that aren't defined elsewhere. Use named entities (@id) consistently across pages so the entity graph resolves cleanly.
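Neither validator is described above as catching dangling references, so a small check of your own can help. The sketch below is a minimal version, assuming you have already collected the JSON-LD from your pages as Python dicts. It treats a dict containing only "@id" as a reference and any dict with "@id" plus other keys as a definition. Both function names are hypothetical.

```python
def _walk(node):
    """Yield every dict nested anywhere inside a JSON-LD structure."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from _walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from _walk(item)


def unresolved_ids(documents):
    """Return the @ids that are referenced but never defined.

    Assumption: {"@id": "..."} alone is a reference; a dict carrying
    "@id" together with other keys (e.g. "@type") is a definition.
    """
    defined, referenced = set(), set()
    for doc in documents:
        for node in _walk(doc):
            node_id = node.get("@id")
            if node_id is None:
                continue
            if set(node) == {"@id"}:
                referenced.add(node_id)
            else:
                defined.add(node_id)
    return referenced - defined
```

Run it over the JSON-LD gathered from the whole site rather than a single page, since the Organization definition usually lives in the layout while the references live on article pages.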
Common mistakes
- Putting schema in the rendered HTML but not in the initial server response: AI crawlers often don't execute JS reliably, so schema injected client-side is invisible to many.
- Unearned dateModified bumps: bumping dateModified without actually modifying content is detectable and de-weighted. Bump only on real revisions.
- Duplicate Organization entries on every page with different @ids: pick one @id (e.g. https://yoursite.com/#org) and reference it sitewide.
- Marking a sales page as Article: Article schema implies editorial content. Promotional pages should be Product/Service/WebPage, not Article.
- FAQPage with too few Q&As: under 3 Qs, FAQPage often gets ignored. Aim for 5+ substantive Q&As per FAQ block.
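Two of these mistakes (client-side-only schema and thin FAQ blocks) can be spot-checked from the raw server response. The sketch below uses only the Python standard library and assumes you pass in the HTML exactly as the server returned it, before any JavaScript runs. The class and function names are hypothetical, and for brevity it only handles top-level JSON-LD objects whose mainEntity is a list (no @graph or top-level arrays).

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect parsed JSON-LD blocks from raw (non-JS-rendered) HTML."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            # Raises json.JSONDecodeError if the block is malformed.
            self.blocks.append(json.loads("".join(self._buffer)))


def audit_server_html(html, min_questions=3):
    """Report schema types found and FAQPage blocks below min_questions."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    extractor.close()
    nodes = [block for block in extractor.blocks if isinstance(block, dict)]
    thin_faqs = [
        node for node in nodes
        if node.get("@type") == "FAQPage"
        and len(node.get("mainEntity", [])) < min_questions
    ]
    return {
        "types": [node.get("@type") for node in nodes],
        "thin_faq_count": len(thin_faqs),
    }
```

An empty "types" list for a page that shows schema in the browser's rendered DOM suggests the markup is being injected client-side.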
Frequently asked
Which schema type matters most for AI citations?
FAQPage by a clear margin. Pages with FAQPage JSON-LD get cited at materially higher rates by Google AI Overview and ChatGPT for informational queries. Make this the first schema you add.
Do I need to add schema to every page?
Add Organization + BreadcrumbList sitewide via the layout. Article + FAQPage on every editorial page. Product/Service on commercial pages. Person on author bylines. Don't add schema types that don't apply — irrelevant schema is sometimes worse than no schema.
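One way to keep this rule enforceable is to encode it as a small lookup that the layout consults. The sketch below is only an illustration: the page kinds ("editorial", "product", "service"), the flag names, and the function name are assumptions made for this example, not part of any library. It also skips BreadcrumbList on the homepage, since that type is for pages deeper than the homepage.

```python
# Hypothetical page-kind labels mapped to the schema types recommended above.
PAGE_SCHEMAS = {
    "editorial": ["Article", "FAQPage"],
    "product": ["Product"],
    "service": ["Service"],
}


def schemas_for(page_kind, has_byline=False, is_homepage=False):
    """Return the schema types to emit for one page."""
    types = ["Organization"]  # sitewide, via the layout
    if not is_homepage:
        types.append("BreadcrumbList")  # only for pages below the homepage
    # Unknown page kinds get nothing extra: irrelevant schema is not added.
    types.extend(PAGE_SCHEMAS.get(page_kind, []))
    if has_byline:
        types.append("Person")
    return types
```

For instance, schemas_for("product") yields Organization, BreadcrumbList, and Product, while an unrecognized page kind falls back to the sitewide types only.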
What's the difference between JSON-LD and other schema formats?
JSON-LD is the recommended format: schema as a JSON block in a <script type="application/ld+json"> tag. Microdata and RDFa work but are harder to maintain. AI engines parse JSON-LD most reliably.
Will adding schema improve my SEO ranking?
Sometimes — schema can unlock rich-result eligibility which affects click-through rate. But schema isn't a direct ranking factor in classical SEO. For AI engines, schema is a more direct citation signal.
How do I know if my schema is working?
Use Google's Rich Results Test to verify it parses. Then monitor your AI visibility + citation rate before and after shipping new schema. Citation rate typically moves 7–14 days after shipping FAQPage on a content library.