
What is llms.txt?

llms.txt is a markdown file at /llms.txt that summarises a website for AI crawlers and LLM-based agents. It tells them what the site is, who it's for, and which pages are authoritative — similar in role to robots.txt for traditional search engines, but with curated editorial content instead of crawl rules.

Definition

llms.txt is an emerging web standard proposed by Jeremy Howard (Answer.AI) and codified at llmstxt.org. It's a Markdown file served at the root of a domain (e.g. yoursite.com/llms.txt) that gives AI agents a fast, structured summary of the site: title, one-paragraph description, and a curated list of authoritative pages grouped by category. It is designed for LLM consumption rather than human reading, though it's perfectly readable either way.

What an llms.txt file looks like

A typical file has four parts. Here's a working example from our own llms.txt (abbreviated):

# monitoraeo

> AI Answer Engine Optimisation (AEO) and Generative Engine Optimisation (GEO) audits.
> We measure how often Claude, ChatGPT, Perplexity, Gemini and Google AI Overviews
> name a brand, cite its domain, and recommend it in buyer-facing answers.

## Key concepts
- [What is AEO?](https://www.monitoraeo.com/what-is-aeo): Answer Engine Optimisation — the practice of getting your brand named...
- [What is GEO?](https://www.monitoraeo.com/what-is-geo): Generative Engine Optimisation — the technical layer...

## Product
- [How it works](https://www.monitoraeo.com/how-it-works): A monitoraeo audit takes a domain...
- [Audit product](https://www.monitoraeo.com/product/audit): One-off diagnostic across all 5 AI engines.

## Pricing
- [Free preview](https://www.monitoraeo.com/#preview): 8 buyer-facing questions...
- [Two Engine Audit](https://www.monitoraeo.com/pricing): $29 one-off...

The four pieces, in order (the llmstxt.org spec strictly requires only the H1; the rest are strongly recommended):

  1. H1: the site name (one line, no description)
  2. Blockquote: a one-paragraph description of what the site is and who it's for
  3. H2 sections grouping links: typically "Key concepts", "Product", "Pricing", "Documentation", etc.
  4. An H2 titled "Optional": for less-critical pages an LLM might want but can skip when it needs a shorter context

That's the whole spec. There's no XML schema to validate against, no rigid metadata fields. The discipline is editorial: pick the 10–30 pages that actually represent your site, write a one-line description of each, group them by purpose.
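As a rough illustration of how little structure there is to check, here is a minimal Python sketch that parses a file into its pieces and reports what is missing. The function names (`parse_llms_txt`, `check_llms_txt`), the sample content, and the link regex are assumptions made for this example; they are not part of the llmstxt.org spec or of any official tooling.

```python
import re

# Matches "- [Title](https://url)" with an optional ": description" suffix.
LINK_RE = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?:: (?P<desc>.*))?$"
)

# Hypothetical sample file used only for this sketch.
SAMPLE = """# Example Site

> One-paragraph description of the site.

## Docs
- [Guide](https://example.com/guide): How to get started.

## Optional
- [Changelog](https://example.com/changelog)
"""


def parse_llms_txt(text: str) -> dict:
    """Split an llms.txt body into title, summary, and H2-grouped links."""
    title = None
    summary_lines = []
    sections = {}
    current = None  # name of the H2 section we are currently inside
    for raw in text.splitlines():
        line = raw.rstrip()
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith(">") and current is None:
            # Blockquote lines before the first H2 form the description.
            summary_lines.append(line[1:].strip())
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                sections[current].append(match.groupdict())
    return {
        "title": title,
        "summary": " ".join(summary_lines),
        "sections": sections,
    }


def check_llms_txt(text: str) -> list[str]:
    """Return a list of problems; an empty list means the basic shape is present."""
    parsed = parse_llms_txt(text)
    problems = []
    if not parsed["title"]:
        problems.append("missing H1 site name")
    if not parsed["summary"]:
        problems.append("missing blockquote description")
    if not parsed["sections"]:
        problems.append("no H2 link sections")
    for name, links in parsed["sections"].items():
        if not links:
            problems.append(f"section '{name}' has no links")
    return problems
```

Calling `check_llms_txt(SAMPLE)` returns an empty list. The check is deliberately shallow: it confirms the shape of the file, not whether the chosen pages are the right ones, which remains an editorial decision.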

llms.txt vs robots.txt vs sitemap.xml

All three are root-level files that talk to crawlers. They solve different problems and should all be published — they don't replace each other.

| What | robots.txt | sitemap.xml | llms.txt |
| --- | --- | --- | --- |
| Format | Plain text rules | XML URL list | Markdown editorial |
| Audience | All crawlers | Search engine crawlers | LLM-based agents + AI crawlers |
| Content style | Allow/disallow rules | Every URL + lastmod | Curated summary + key links |
| Optimises for | Access control | Discoverability | Comprehension |
| Typical length | 10–30 lines | 100–10,000+ URLs | 30–100 lines |
| Should you publish? | Yes | Yes | Yes |

Do AI engines actually read it?

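Honest answer: adoption is mixed, but trending up fast. Status as of mid-2026: Anthropic and Perplexity have confirmed reading it, OpenAI hasn't formally announced support, and Google AI products don't fetch it.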

Publishing takes about 10 minutes once, and the file only needs touching again when your site structure changes. There is no real downside, and the upside compounds as adoption grows. There's no defensible reason not to publish one.

How to publish your llms.txt

Three steps:

1. Write the file. Start from the spec at llmstxt.org or use our example above as a template. Keep it under 100 lines. Include only your authoritative pages — the ones you'd want quoted by an AI summarising your site. Skip blog posts, support FAQs, anything ephemeral.

2. Serve it at the root. Publish at yoursite.com/llms.txt with content-type text/markdown; charset=utf-8 (preferred) or text/plain. With static site generators you drop the file into the static assets directory: public/ for Astro and Next.js static export, static/ for Hugo. For dynamic sites, expose a single route; it's just text, no templating needed.

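3. Reference it from robots.txt. Add a comment line as an informal pointer for AI crawlers (and people) that read robots.txt first. There is no standard robots.txt directive for llms.txt, so standards-compliant parsers will simply skip the comment: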

# AI summary: https://yoursite.com/llms.txt
Sitemap: https://yoursite.com/sitemap.xml
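For a dynamic site, step 2 can be sketched with nothing but the Python standard library. This is a minimal illustration rather than a production setup: the names `build_response`, `LlmsTxtHandler`, and `serve`, the on-disk location `llms.txt`, and port 8000 are all assumptions made for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path


def build_response(request_path: str, llms_file: Path):
    """Return (status, headers, body) for a GET request.

    Only /llms.txt is answered; everything else gets a 404.
    """
    if request_path != "/llms.txt" or not llms_file.is_file():
        return 404, {"Content-Type": "text/plain; charset=utf-8"}, b"Not found\n"
    body = llms_file.read_bytes()
    # text/markdown is the preferred content type; text/plain also works.
    return 200, {"Content-Type": "text/markdown; charset=utf-8"}, body


class LlmsTxtHandler(BaseHTTPRequestHandler):
    # Hypothetical location of the file on disk; adjust for your project.
    llms_file = Path("llms.txt")

    def do_GET(self):
        status, headers, body = build_response(self.path, self.llms_file)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Serve llms.txt on localhost until interrupted (for local testing)."""
    HTTPServer(("127.0.0.1", port), LlmsTxtHandler).serve_forever()
```

Running `serve()` and requesting `http://127.0.0.1:8000/llms.txt` should return the file with the Markdown content type. In most real deployments the same effect comes from a one-line route in the existing web framework or a header rule at the CDN, so treat this only as a picture of what the route has to do.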

Common mistakes

How monitoraeo uses llms.txt

We publish our own at /llms.txt and check whether yours exists as one of the 15 technical foundations in every paid audit. Sites that publish a well-formed llms.txt consistently see higher visibility scores in Claude and Perplexity, the engines that have confirmed parsing it, within 2–4 weeks of publishing.


Frequently asked about llms.txt

Do AI engines actually read llms.txt?

Adoption is mixed but trending up. Anthropic and Perplexity have confirmed reading it. OpenAI hasn't formally announced support. Google AI products don't fetch it as of mid-2026. Publishing takes about 10 minutes once and has no real downside, so it's worth doing today.

How is llms.txt different from robots.txt?

robots.txt gates crawler access via allow/disallow rules. llms.txt provides editorial summary content for LLM-based agents. Different jobs — both should be published.

How is llms.txt different from sitemap.xml?

Sitemap = every URL in machine-readable XML (completeness). llms.txt = 10–30 curated pages with editorial descriptions (comprehension). Both should exist.

What goes in an llms.txt file?

Four sections: H1 with site name, blockquote with one-paragraph description, one or more H2-grouped link sections, optional "Optional" H2 for less-critical pages. Each link is a markdown list item with URL + short description. Typically 30–100 lines total.
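If your page list already lives in structured data, the file can be generated rather than hand-written. The sketch below is one possible approach, assuming hypothetical `Link` and `Section` dataclasses and a `render_llms_txt` helper; none of these names come from the spec.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str
    description: str = ""  # short editorial description; may be empty


@dataclass
class Section:
    heading: str  # rendered as an H2, e.g. "Product" or "Optional"
    links: list[Link] = field(default_factory=list)


def render_llms_txt(site_name: str, summary: str, sections: list[Section]) -> str:
    """Render H1, blockquote, and H2-grouped link lists as one Markdown string."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for section in sections:
        lines.append(f"## {section.heading}")
        for link in section.links:
            item = f"- [{link.title}]({link.url})"
            if link.description:
                item += f": {link.description}"
            lines.append(item)
        lines.append("")  # blank line between sections
    # Normalise to exactly one trailing newline.
    return "\n".join(lines).rstrip("\n") + "\n"
```

A call such as `render_llms_txt("Example Site", "A short description.", [Section("Docs", [Link("Guide", "https://example.com/guide", "Getting started.")])])` produces a file in the four-part shape described above. Generation handles the formatting only; choosing which pages to include and how to describe them is still manual work.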

Where do I publish llms.txt?

At yoursite.com/llms.txt with content-type text/markdown; charset=utf-8 (preferred) or text/plain. Reference it from robots.txt with a comment line so crawlers that read robots first know it exists.

How often should I update llms.txt?

When site structure changes — new top-level sections, renamed pages, deprecated features. Not every blog post. Monthly review + quarterly updates is the right cadence for most sites.