Structured data for AI engines

Structured data (JSON-LD) tells AI engines exactly what your page is about, in a format they can parse without ambiguity. The schemas that move the needle for AI citations differ from the schemas that win classical SEO rich results, and implementing the wrong ones wastes effort.

The 6 schema types that matter most for AI

  1. FAQPage — highest leverage. AI engines (especially Google AI Overview and ChatGPT) preferentially extract from FAQPage-marked content. Wrap every Q&A block on the site.
  2. Organization — the canonical entity signal. Include name, url, logo, description, and sameAs links to verifiable profiles (LinkedIn, Crunchbase, Wikipedia, X). Publish sitewide via the layout.
  3. Article — for any editorial content. Include headline, datePublished, dateModified, author (with Person schema), and publisher reference. AI engines weight recency heavily — dateModified directly affects citation rate.
  4. Product / Service — for commercial pages. Include name, description, brand reference, and Offer with price + availability. Helps AI engines extract pricing accurately (a top hallucination vector when missing).
  5. Person — for author bylines on editorial content. Include name, jobTitle, and sameAs links to LinkedIn/X/GitHub. Anthropic's Claude weights named authorship explicitly.
  6. BreadcrumbList — for any page deeper than the homepage. Cheap, easy, helps AI engines build the hierarchy of your site.
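
To make the list above concrete, here is a minimal sketch of how several of these types could be assembled into one JSON-LD graph. Everything specific in it is an assumption for illustration: the example.com domain, the names, the dates, and the choice to put all nodes in a single @graph. Product/Service is left out for brevity.

```python
import json

SITE = "https://example.com"                # hypothetical domain
ORG_ID = f"{SITE}/#org"                     # one Organization @id, reused sitewide
AUTHOR_ID = f"{SITE}/#jane-doe"             # hypothetical author @id
PAGE_URL = f"{SITE}/blog/structured-data"   # hypothetical page URL


def build_graph():
    """Return a JSON-LD document with placeholder values for one editorial page."""
    organization = {
        "@type": "Organization",
        "@id": ORG_ID,
        "name": "Example Co",
        "url": SITE,
        "logo": f"{SITE}/logo.png",
        "description": "Placeholder description of the company.",
        "sameAs": ["https://www.linkedin.com/company/example-co"],
    }
    person = {
        "@type": "Person",
        "@id": AUTHOR_ID,
        "name": "Jane Doe",
        "jobTitle": "Head of Content",
        "sameAs": ["https://www.linkedin.com/in/jane-doe-example"],
    }
    article = {
        "@type": "Article",
        "@id": f"{PAGE_URL}#article",
        "headline": "Structured data for AI engines",
        "datePublished": "2024-01-15",   # placeholder dates
        "dateModified": "2024-03-02",
        "author": {"@id": AUTHOR_ID},    # reference to the Person node above
        "publisher": {"@id": ORG_ID},    # reference to the Organization node above
    }
    faq = {
        "@type": "FAQPage",
        "@id": f"{PAGE_URL}#faq",
        # One question shown for brevity; a real block would carry more.
        "mainEntity": [
            {
                "@type": "Question",
                "name": "Placeholder question?",
                "acceptedAnswer": {"@type": "Answer", "text": "Placeholder answer."},
            }
        ],
    }
    breadcrumbs = {
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": 1, "name": "Home", "item": SITE},
            {"@type": "ListItem", "position": 2, "name": "Blog", "item": f"{SITE}/blog"},
        ],
    }
    return {
        "@context": "https://schema.org",
        "@graph": [organization, person, article, faq, breadcrumbs],
    }


print(json.dumps(build_graph(), indent=2))
```

The Article node points at the author and publisher by @id instead of repeating their fields, which is one way to keep a single definition of each entity.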

Schema types that DON'T help AI (and may waste effort)

  • Recipe, HowTo (most cases) — unless your content is genuinely procedural, these don't help. Recipe schema on a non-recipe page is sometimes worse than no schema.
  • VideoObject — only helps if you have actual video content. Marking a page that merely embeds a single YouTube video as a video page is overreach.
  • Review / AggregateRating — these help classical SEO rich results but AI engines treat self-published ratings with suspicion. Third-party review schema (from G2, Capterra) helps more than your own.
  • SoftwareApplication — useful in some cases, but AI engines often ignore software-specific schema in favour of Organization + Product. Lower leverage than the top 6.

How to validate your schema

Two tools, both free:

  • Google's Rich Results Test (search.google.com/test/rich-results): paste any URL to see which schemas Google parsed and which rich result types you're eligible for.
  • Schema.org's validator (validator.schema.org): pure spec compliance; catches structural errors.

Run both on every new content page before considering it shipped.

Common mistake: schema validates structurally but the entity references (publisher @id, author @id) point to entities that aren't defined elsewhere. Use named entities (@id) consistently across pages so the entity graph resolves cleanly.
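
Neither validator is described here as checking that @id references resolve across pages, so a small script can help. The sketch below is one possible approach, not a standard tool: it assumes you have already parsed the JSON-LD from your pages into Python dicts, and it treats any object containing only an @id key as a reference and any object with an @id plus other keys as a definition. The function names are hypothetical.

```python
def collect_ids(node, defined, referenced):
    """Walk a parsed JSON-LD value, recording defined and referenced @ids."""
    if isinstance(node, dict):
        node_id = node.get("@id")
        if node_id is not None:
            # Assumption: {"@id": "..."} alone is a reference;
            # an @id alongside other properties is a definition.
            if set(node.keys()) == {"@id"}:
                referenced.add(node_id)
            else:
                defined.add(node_id)
        for value in node.values():
            collect_ids(value, defined, referenced)
    elif isinstance(node, list):
        for item in node:
            collect_ids(item, defined, referenced)


def unresolved_references(documents):
    """Return @ids that are referenced somewhere but defined nowhere."""
    defined, referenced = set(), set()
    for doc in documents:
        collect_ids(doc, defined, referenced)
    return sorted(referenced - defined)


# Example: the Article references an author that no document defines.
docs = [
    {"@type": "Organization", "@id": "https://example.com/#org", "name": "Example Co"},
    {
        "@type": "Article",
        "@id": "https://example.com/post#article",
        "publisher": {"@id": "https://example.com/#org"},
        "author": {"@id": "https://example.com/#jane-doe"},
    },
]
print(unresolved_references(docs))
```

Passing the JSON-LD from every page in one call lets a definition on one page (for example the sitewide Organization) satisfy references on another.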

Common mistakes

  • Putting schema in the rendered HTML but not in the initial server response — AI crawlers often don't execute JS reliably; schema injected client-side is invisible to many.
  • Bumping dateModified without real changes — updating dateModified without actually modifying content is detectable and de-weighted. Bump only on real revisions.
  • Duplicate Organization entries on every page with different @ids — pick one @id (e.g. https://yoursite.com/#org) and reference it sitewide.
  • Marking a sales page as Article — Article schema implies editorial content. Promotional pages should be Product/Service/WebPage, not Article.
  • FAQPage with too few Q&As — with fewer than 3 questions, FAQPage often gets ignored. Aim for 5+ substantive Q&As per FAQ block.
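
Two of the mistakes above (schema missing from the initial server response, and thin FAQPage blocks) can be checked mechanically. The following is a rough sketch using only the Python standard library. It assumes you pass in the raw HTML exactly as the server returned it, with no JavaScript executed; the class and function names are hypothetical, and the threshold of 3 mirrors the figure mentioned above.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            # Raises ValueError if the block is not valid JSON.
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def lint_initial_html(html, min_questions=3):
    """Return a list of human-readable problems found in the raw HTML."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()

    problems = []
    if not parser.blocks:
        problems.append("no JSON-LD in initial HTML")

    for block in parser.blocks:
        # A block may be a single node, a list of nodes, or a document with @graph.
        nodes = block if isinstance(block, list) else block.get("@graph", [block])
        for node in nodes:
            if node.get("@type") == "FAQPage":
                questions = node.get("mainEntity", [])
                if isinstance(questions, dict):
                    questions = [questions]
                if len(questions) < min_questions:
                    problems.append(
                        f"FAQPage has {len(questions)} question(s), "
                        f"fewer than {min_questions}"
                    )
    return problems
```

The raw body could come from something like `urllib.request.urlopen(url).read().decode("utf-8")`, which fetches the server response without running any client-side scripts.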

Frequently asked

Which schema type matters most for AI citations?

FAQPage by a clear margin. Pages with FAQPage JSON-LD get cited at materially higher rates by Google AI Overview and ChatGPT for informational queries. Make this the first schema you add.

Do I need to add schema to every page?

Add Organization + BreadcrumbList sitewide via the layout. Article + FAQPage on every editorial page. Product/Service on commercial pages. Person on author bylines. Don't add schema types that don't apply — irrelevant schema is sometimes worse than no schema.
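
As a rough illustration, the rule of thumb in this answer could be encoded as a small lookup in a site's layout code. The page-kind labels ("editorial", "product", "service") and the function name are assumptions made up for this sketch.

```python
# Schema types added on every page via the layout.
SITEWIDE = ["Organization", "BreadcrumbList"]

# Hypothetical page-kind labels mapped to the extra types suggested above.
BY_PAGE_KIND = {
    "editorial": ["Article", "FAQPage"],
    "product": ["Product"],
    "service": ["Service"],
}


def schema_types_for(page_kind, has_author_byline=False):
    """Return the schema types to emit for a page, per the rule of thumb above."""
    types = list(SITEWIDE)
    # Unknown page kinds get only the sitewide types: no irrelevant schema.
    types += BY_PAGE_KIND.get(page_kind, [])
    if has_author_byline:
        types.append("Person")
    return types


print(schema_types_for("editorial", has_author_byline=True))
```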

What's the difference between JSON-LD and other schema formats?

JSON-LD is the recommended format: the schema lives in a JSON block inside a <script type="application/ld+json"> tag. Microdata and RDFa work but are harder to maintain. AI engines parse JSON-LD most reliably.
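
For a sense of what that looks like in practice, here is a minimal sketch of rendering a Python dict into such a script tag on the server. The function name is hypothetical. Escaping "</" is a common precaution, so that a string value containing "</script>" cannot close the tag early; the escaped form is still valid JSON.

```python
import json


def render_jsonld_script(data):
    """Serialize a dict as a JSON-LD <script> tag for server-rendered HTML."""
    payload = json.dumps(data, ensure_ascii=False, indent=2)
    # "\/" is a legal JSON escape for "/", so parsers read the same value back.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


page = {
    "@context": "https://schema.org",
    "@type": "WebPage",
    "name": "Structured data for AI engines",
}
print(render_jsonld_script(page))
```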

Will adding schema improve my SEO ranking?

Sometimes — schema can unlock rich-result eligibility which affects click-through rate. But schema isn't a direct ranking factor in classical SEO. For AI engines, schema is a more direct citation signal.

How do I know if my schema is working?

Use Google's Rich Results Test to verify it parses. Then monitor your AI visibility + citation rate before and after shipping new schema. Citation rate typically moves 7–14 days after shipping FAQPage on a content library.