Surfient module · Distribution

Publish llms.txt, ai-sitemap.xml, and products.ndjson — always fresh, always valid

The plumbing AI engines probe for, generated for every Shopify store, refreshed whenever your catalog changes, and hosted on your own domain — not ours.

  • Surfient generates and hosts the four files AI engines actually fetch: llms.txt, llms-full.txt, ai-sitemap.xml, and products.ndjson — on your own domain, at conventional paths.
  • Files regenerate automatically on every product, collection, or policy update via Shopify webhooks — so the feed never lags the storefront by more than a few minutes.
  • Every generated file is validated against the published spec before it ships, so nothing breaks silently when the NDJSON format tightens or llms.txt gains a new section.

AI-Ready Files

Regenerated on every publish

5 / 5 live

surfient · regen
$ regen llms.txt · 8.2 kB · ok
$ regen llms-full.txt · 142 kB · ok
$ regen ai-sitemap.xml · 54 kB · ok
$ regen products.ndjson · 2.1 MB · ok
$ regen rss.xml · 12 kB · ok
5/5 published · 1.2s · next regen on publish event

The problem

Your storefront isn't the problem — your feeds are

Modern AI engines don't crawl the web the way Google did. They probe for a handful of conventional endpoints, ingest the structured data, and build their answer from that. A store without those endpoints is invisible the way a restaurant without a Google Business profile is invisible on Maps.

  • 4

    files AI engines probe for before they'll trust a site as a structured source

    llms.txt, llms-full.txt, ai-sitemap.xml, and a JSON or NDJSON product feed at a predictable path.

  • 0.7%

    of Shopify stores publish even one of those four files today

    We crawled 410,000 Shopify-hosted stores in January 2026. Roughly three thousand had an llms.txt. Almost none had a working NDJSON feed.

  • 11 min

    median staleness for stores using a manual feed export

    The merchants who do publish a feed typically regenerate it daily or weekly — the AI index catches that drift, and your bestsellers go out of sync.

How it works

Four files, zero drift

We generate each file from live Shopify data, regenerate on every webhook event, and serve them from your domain so AI engines don't have to follow a redirect chain.

  1. Generate llms.txt

    Surfient composes an llms.txt that describes your store's purpose, highlights your best-of catalog, lists your policies and contact endpoints, and points to the structured feeds below. The file follows the llms.txt 0.3 spec with the commerce extension, so ChatGPT, Claude, and Perplexity can parse it without guessing.

  2. Generate ai-sitemap.xml

    A model-optimised sitemap that lists products, collections, articles, and policies with last-modified timestamps, canonical URLs, and an x-surfient-citability score. AI crawlers using it can pick the freshest, most citable pages first and skip the thin ones.

  3. Generate products.ndjson

    One product per line, each line a valid JSON object with title, handle, description, up to 8 media URLs, spec table, pricing, availability, review aggregate, and canonical URL. NDJSON can be parsed as a stream, one line at a time, so an engine can ingest a 50,000-SKU catalog without loading the whole file into memory.

  4. Host on your domain + refresh on webhooks

    Every file is served from your own domain at /llms.txt, /llms-full.txt, /ai-sitemap.xml, and /products.ndjson. We subscribe to the Shopify products/update, collections/update, and themes/publish webhooks. The moment something changes, the relevant file rebuilds and its cached copy is invalidated.
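
As a rough illustration of the one-object-per-line format described in step 3, the Python sketch below writes a small feed and then streams it back record by record. It is not Surfient's generator: the function names, the field names, and the two sample products are assumptions made for this example.

```python
import io
import json
from typing import Iterable, Iterator, TextIO


def write_ndjson(products: Iterable[dict], out: TextIO) -> int:
    """Write one compact JSON object per line and return the line count."""
    count = 0
    for product in products:
        # json.dumps escapes newlines inside strings, so every record
        # stays on exactly one physical line.
        out.write(json.dumps(product, separators=(",", ":")))
        out.write("\n")
        count += 1
    return count


def read_ndjson(lines: Iterable[str]) -> Iterator[dict]:
    """Yield records one at a time; memory use is bounded by the longest line."""
    for line in lines:
        line = line.strip()
        if line:  # tolerate blank lines between records
            yield json.loads(line)


# Hypothetical sample records (field names are illustrative only).
sample = [
    {"handle": "wool-throw", "title": "Wool Throw", "price": "89.00", "available": True},
    {"handle": "oak-tray", "title": "Oak Tray\n(large)", "price": "45.00", "available": False},
]

buffer = io.StringIO()
written = write_ndjson(sample, buffer)

buffer.seek(0)
handles = [record["handle"] for record in read_ndjson(buffer)]
```

Because each line is parsed on its own, a consumer can stop, resume, or split the work at any line boundary, which is the property the 50,000-SKU claim relies on.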

Inside the app

What you’ll see after install

Every number a Shopify merchant running Surfient AI-Ready Files tracks, at a glance and live from the Surfient admin: AI engine splits, revenue lift, and the exact state of your catalog across ChatGPT, Perplexity, Google AI Overviews, Claude, Gemini, and Copilot.

Capabilities

What the generator does for you

The surface area is deliberately narrow — publish the four files, validate them, keep them fresh. Everything below is in service of that.

  • llms.txt with the commerce extension

    Not every llms.txt in the wild is valid. Surfient writes to the 0.3 spec plus the commerce extension so assistants treat your store as a product catalog source, not a generic blog. Includes store metadata, preferred contact, citation policy, and the list of feeds.

  • llms-full.txt with quotable brand facts

    A longer companion file with brand-voice-approved facts an engine can quote verbatim. Policies, guarantees, shipping regions, craftsmanship claims — everything you want ChatGPT to know and nothing you don't. Edit the facts in the admin; the file regenerates in under a minute.

  • ai-sitemap.xml with citability hints

    Standard sitemap fields plus x-surfient-citability, a float between 0 and 1 the audit engine computes for each URL. Crawlers that respect the hint (we've tested Perplexity, ClaudeBot, and our own partner crawlers) pull the highest-scoring pages first.

  • products.ndjson stream-friendly feed

    NDJSON because AI engines ingest feeds at streaming scale. Each line is independent, so a 50,000-SKU catalog is trivially chunked. Fields match the Product schema an engine would extract anyway, which means no field mapping on ingestion.

  • Webhook-driven freshness

    Every write to a Shopify resource fires the matching Surfient regenerator. Product updated? products.ndjson rebuilds in under 60 seconds. Policy page edited? llms-full.txt refreshes. Delete a SKU? It's gone from every feed before your team has reloaded the tab.

  • Spec validation on every build

    We validate the generated llms.txt against the 0.3 schema, the sitemap XML against the sitemaps.org schema, and each NDJSON line against our Zod product schema. A build that would publish malformed output fails and pages a human. It never silently ships a broken file to an engine.

  • Served from your domain, not ours

    The files live at yourstore.com/llms.txt, not surfient.com/yourstore/llms.txt. AI engines treat them as first-party statements from your brand — no redirect chains, no trust transfer, no CORS gotchas when a model probes the endpoint.

  • Version history and rollback

    Every regeneration is archived. If a bad product description leaks into llms-full.txt, you can diff against yesterday's build and roll back the file in a single click while you fix the source.
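
To make the "spec validation on every build" idea concrete, here is a minimal Python sketch of a per-line feed check that fails the whole build on the first malformed record. The page says the real check uses a Zod schema; this sketch uses plain type checks as a stand-in. The field names, the `FeedValidationError` class, and the sample record are hypothetical.

```python
import json
from typing import Iterable

# Hypothetical field names and types, loosely based on the fields
# the page lists for products.ndjson.
REQUIRED_FIELDS = {
    "title": str,
    "handle": str,
    "description": str,
    "media": list,
    "price": str,
    "available": bool,
    "url": str,
}
MAX_MEDIA = 8  # the page mentions "up to 8 media URLs"


class FeedValidationError(ValueError):
    """Raised when a feed line must not be published."""


def validate_line(line: str, line_no: int) -> dict:
    """Parse one NDJSON line and check it against REQUIRED_FIELDS."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        raise FeedValidationError(f"line {line_no}: invalid JSON ({exc.msg})") from exc
    if not isinstance(record, dict):
        raise FeedValidationError(f"line {line_no}: expected a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            raise FeedValidationError(f"line {line_no}: missing field '{field}'")
        if not isinstance(record[field], expected_type):
            raise FeedValidationError(f"line {line_no}: '{field}' has the wrong type")
    if len(record["media"]) > MAX_MEDIA:
        raise FeedValidationError(f"line {line_no}: more than {MAX_MEDIA} media URLs")
    return record


def validate_feed(lines: Iterable[str]) -> int:
    """Validate every non-blank line; raise on the first failure.

    Returns the number of records checked, so the caller publishes
    only when the whole feed has passed.
    """
    checked = 0
    for line_no, line in enumerate(lines, start=1):
        if line.strip():
            validate_line(line, line_no)
            checked += 1
    return checked


good_line = json.dumps({
    "title": "Wool Throw",
    "handle": "wool-throw",
    "description": "A heavy wool throw.",
    "media": ["https://example.com/wool-throw.jpg"],
    "price": "89.00",
    "available": True,
    "url": "https://example.com/products/wool-throw",
})
checked = validate_feed([good_line])
```

Raising on the first bad line, rather than skipping it, mirrors the behaviour described above: a partially valid feed is treated as a failed build, not a smaller feed.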

Customer proof

Proof

We had an ai-sitemap within 40 minutes of installing. Two weeks later Perplexity was citing our gift-guide collection page by name. The only thing that had changed was that we existed in a format the model could read.
Lars Bergström · Founder, Nordfell Goods

40 min

install to first published ai-sitemap

FAQ

Questions, answered straight

  • Why not just rely on Shopify's built-in sitemap?

    Shopify's sitemap lists URLs — that's all. AI engines ingest URLs, but they also want structured data (product price, availability, spec), a policy on citation, and a hint about which URLs are most worth reading. ai-sitemap.xml and llms.txt together carry that extra signal. Shopify's sitemap stays exactly as-is; the AI-Ready files layer on top.

  • Will this conflict with my existing robots.txt or sitemap?

    No. llms.txt and ai-sitemap.xml live at separate paths. We add one line to robots.txt referencing the new sitemap entry — the standard Sitemap: directive — and nothing more. Your existing Google, Bing, and social crawler rules stay untouched.

  • Do AI engines actually read these files today?

    Yes. We've instrumented access logs across 180 Surfient customers and tracked named-agent fetches from GPTBot, ClaudeBot, PerplexityBot, and Google-Extended on /llms.txt, /ai-sitemap.xml, and /products.ndjson in the past 90 days. The median store sees 12 fetches a week from named AI crawlers on these paths alone.

  • How big can the products.ndjson feed get?

    We've tested up to 250,000 SKUs. NDJSON streams well — each line is independently parseable, so an engine can ingest at whatever chunk size it likes. The generator gzips responses, so a 50,000-SKU catalog transfers at around 8 MB over the wire.

  • Can I customise what goes in llms-full.txt?

    Yes. llms-full.txt is the one file you author directly — in Surfient's admin, under Brand Facts. You write the citation-ready sentences; the engine composes them with the site metadata. Changes publish within 60 seconds and are versioned.

  • What happens when a product is deleted from Shopify?

    The products/delete webhook fires, Surfient removes the row from products.ndjson and the URL from ai-sitemap.xml, invalidates the CDN cache, and logs the removal in your version history. Next time an AI engine fetches the feed, the SKU is gone, and so is the risk of a model citing a 404.
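
The deletion flow in the last answer can be sketched in a few lines of Python. This is an assumption-laden illustration, not Surfient's handler: the helper names, the `handle` field, and the sample feed and sitemap are invented for the example, and cache invalidation and version-history logging are left out.

```python
import json
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def remove_from_ndjson(ndjson_text: str, handle: str) -> str:
    """Return the feed without the record whose 'handle' matches."""
    kept = []
    for line in ndjson_text.split("\n"):
        if line.strip() and json.loads(line).get("handle") != handle:
            kept.append(line)
    return "".join(line + "\n" for line in kept)


def remove_from_sitemap(sitemap_xml: str, url: str) -> str:
    """Return the sitemap without the <url> entry whose <loc> equals url."""
    # Keep the sitemap namespace as the default one when serializing.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)
    for url_el in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url_el.find(f"{{{SITEMAP_NS}}}loc")
        if loc is not None and (loc.text or "").strip() == url:
            root.remove(url_el)
    return ET.tostring(root, encoding="unicode")


# Hypothetical store data for the example.
feed = (
    '{"handle":"wool-throw","title":"Wool Throw"}\n'
    '{"handle":"oak-tray","title":"Oak Tray"}\n'
)
sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/products/wool-throw</loc></url>"
    "<url><loc>https://example.com/products/oak-tray</loc></url>"
    "</urlset>"
)

# Simulate the deletion of the 'oak-tray' product.
new_feed = remove_from_ndjson(feed, "oak-tray")
new_sitemap = remove_from_sitemap(sitemap, "https://example.com/products/oak-tray")
```

Removing the entry from both files in the same step is what keeps the sitemap from advertising a URL that the product feed no longer backs.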

Spin up your AI-Ready files in under an hour

We'll generate your llms.txt, llms-full.txt, ai-sitemap.xml, and products.ndjson from your live Shopify catalog — validated, hosted on your domain, refreshed whenever your store changes.