Frequently asked questions

I ranked #1 on Google yesterday, so why am I not in ChatGPT?

Ranking on Google and being cited in ChatGPT are almost independent signals. Google rewards keyword relevance and link authority; assistants reward schema clarity, llms.txt presence, sentence-shaped content, and honest comparison pages. Some of your top-ranking Google content is too long, too brand-voice-heavy, and too thin on structured data to be quotable, and quotable passages are what get cited.

How do I actually know if GPTBot is crawling me?

Grep your access logs for the user-agent tokens 'GPTBot', 'ClaudeBot', 'PerplexityBot', 'CCBot' and 'GoogleOther'. ('Google-Extended' is a robots.txt control token, not a crawler with its own user-agent string, so it will not appear in logs; check for it in robots.txt instead.) If you see hits in the last 30 days, you are being crawled. If there's a gap of more than 45 days on any of them, the most likely cause is that your robots.txt is blocking them or your firewall / Cloudflare rules are challenging them with a CAPTCHA.
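If you'd rather script the check than grep by hand, here is a minimal sketch in Python. It assumes your logs are plain-text lines that include the user-agent string; the sample lines, IP addresses and agent strings below are made up for illustration, and the function name is ours, not part of any tool.

```python
from collections import Counter

# Tokens to look for in the user-agent field. Google-Extended is left out
# on purpose: it is a robots.txt token and is not sent as a user agent.
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot", "CCBot", "GoogleOther")

def count_crawler_hits(log_lines):
    """Count log lines that mention each crawler token (case-insensitive)."""
    hits = Counter({bot: 0 for bot in AI_CRAWLERS})
    for line in log_lines:
        lowered = line.lower()
        for bot in AI_CRAWLERS:
            if bot.lower() in lowered:
                hits[bot] += 1
    return dict(hits)

# Illustrative log lines only; real user-agent strings differ.
sample_log = [
    '203.0.113.7 - - [01/Mar/2025:10:00:00 +0000] "GET /products/desk HTTP/1.1" 200 512 "-" "ExampleAgent (compatible; GPTBot/1.0)"',
    '203.0.113.8 - - [01/Mar/2025:10:05:00 +0000] "GET /collections/desks HTTP/1.1" 200 734 "-" "ExampleAgent (compatible; ClaudeBot/1.0)"',
    '198.51.100.2 - - [01/Mar/2025:10:06:00 +0000] "GET / HTTP/1.1" 200 901 "-" "Mozilla/5.0 (ordinary browser)"',
]

print(count_crawler_hits(sample_log))
```

A crawler that stays at zero across a 45-day window is the one to investigate first.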

Is this just SEO work repackaged?

Crawl-layer work overlaps with SEO (~60%). Signal and narrative layers do not. A site that passed every traditional SEO audit in 2022 will fail about 8 of these 14 checks today, because the signals assistants reward weren't on any SEO rubric three years ago. Treat it as adjacent work — leverage the crawl foundations, then layer the signal and narrative fixes on top.

Why do I care about AI Overviews if I'm not in ChatGPT yet?

AI Overviews appears above the blue links on ~18% of commercial queries and on the ones with the highest purchase intent. A shopper who reads the Overview and sees three brands — none of which are you — will not scroll. The fix is the same fix: schema clarity, FAQPage, and clean quotable paragraphs. You get the AI Overviews win for free while working on ChatGPT.

How many prompts should my tracking panel cover?

For most Shopify stores, 20-40 prompts. Cover your top five category questions ("best X under $Y", "alternative to [competitor]", "X for small space"), your top five product-specific prompts, and 10-15 long-tail prompts pulled from your customer-service tickets and post-purchase surveys. Run the panel weekly, log which brands are named and in which position. Citation share is your KPI, not click count.
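Citation share is simple arithmetic once the panel is logged. The sketch below assumes each weekly run is stored as a prompt plus the ordered list of brands the assistant named; the record shape, brand names and function names are hypothetical.

```python
def citation_share(panel_runs, brand):
    """Fraction of prompts in which `brand` was named at least once."""
    if not panel_runs:
        return 0.0
    cited = sum(1 for run in panel_runs if brand in run["brands_named"])
    return cited / len(panel_runs)

def average_position(panel_runs, brand):
    """Mean 1-based position of `brand` across the prompts that named it."""
    positions = [
        run["brands_named"].index(brand) + 1
        for run in panel_runs
        if brand in run["brands_named"]
    ]
    return sum(positions) / len(positions) if positions else None

# Hypothetical one-week panel with four prompts.
panel_runs = [
    {"prompt": "best standing desk under $400", "brands_named": ["BrandA", "YourStore", "BrandB"]},
    {"prompt": "alternative to BrandA", "brands_named": ["BrandB", "BrandC"]},
    {"prompt": "desk for small space", "brands_named": ["YourStore"]},
    {"prompt": "quiet standing desk motor", "brands_named": []},
]

print(citation_share(panel_runs, "YourStore"))    # 0.5
print(average_position(panel_runs, "YourStore"))  # 1.5
```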

Why your Shopify store isn't in ChatGPT

You rank on Google. Your reviews are 4.7. Your PDPs convert. You type your buying question into ChatGPT, it gives you three stores, and none of them are yours. This post is the diagnostic we run when a merchant sends us that screenshot.

Figure 1: The 14-point diagnostic, grouped into Crawl, Signal and Narrative columns. Each check shows a colour-coded MISS, WARN or OK status with a short reason. The first pass on a typical absent-from-ChatGPT store surfaces 6 misses, 4 at-risk, 4 passing.

Three categories, in this exact order

Stores fail at three layers: crawl, signal, narrative. The order isn't decorative. Crawl failures silence every other signal you publish — if GPTBot can't reach the page, your perfect JSON-LD and your witty PDP lead never make it into the retrieval index. The merchants who fix this fastest are the ones who stop triaging by what feels fun to work on and start triaging by which layer is actually blocking them.

Layer 1 — Crawl (five checks)

1. llms.txt at the root

A missing /llms.txt is the single most common miss we find. The file is a proposed convention rather than a formal standard, and it works less like robots.txt (it grants or blocks nothing) and more like a curated sitemap for assistants: a short Markdown list of canonical URLs and section headings a model should reach for when answering questions about your domain. It's 3 KB of work and it typically earns citations within a week.
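For a sense of what the file looks like, here is a rough sketch that assembles one in Python. The layout (a title, a one-line summary, then headed lists of links) follows the commonly circulated llms.txt proposal as we understand it; the store name, URLs and the build_llms_txt helper are placeholders.

```python
def build_llms_txt(site_name, summary, sections):
    """Assemble llms.txt text: title, summary, then one link list per section."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"

# Placeholder store and URLs, for illustration only.
llms_txt = build_llms_txt(
    "Example Desks",
    "Standing desks and desk accessories for small home offices.",
    {
        "Products": [
            ("Standing desk", "https://example.com/products/standing-desk",
             "Flagship electric standing desk"),
        ],
        "Guides": [
            ("Desk sizing guide", "https://example.com/pages/sizing",
             "How to pick a desk width for a small room"),
        ],
    },
)

print(llms_txt)
```

Save the output as llms.txt and serve it from the site root.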

2. GPTBot allowed in robots.txt

Many stores added a "User-agent: GPTBot" / "Disallow: /" rule during the 2023 scraping panic and then forgot about it. Two and a half years later, the same robots.txt still blocks the crawler behind the assistant most of your shoppers are now asking. Check it. If GPTBot is blocked, delete the block, redeploy, and wait 72 hours.
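You can test a robots.txt offline with Python's standard urllib.robotparser before redeploying. A minimal sketch, assuming you paste the file contents in as a string; the blocked_crawlers helper and the example.com URL are ours.

```python
from urllib.robotparser import RobotFileParser

def blocked_crawlers(robots_txt,
                     crawlers=("GPTBot", "ClaudeBot", "PerplexityBot"),
                     url="https://example.com/"):
    """Return the crawlers that this robots.txt forbids from fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in crawlers if not parser.can_fetch(bot, url)]

# The leftover 2023-style block described above.
legacy_robots = """User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

# The same file with the GPTBot block deleted.
fixed_robots = """User-agent: *
Allow: /
"""

print(blocked_crawlers(legacy_robots))  # ['GPTBot']
print(blocked_crawlers(fixed_robots))   # []
```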

3. Sitemap complete

Your sitemap.xml is the anchor an assistant's retriever looks at when the model cites you. Missing sitemaps, sitemaps with 404s, or sitemaps that stop at 500 URLs (Shopify's default sitemap paginates) are common sources of partial indexation. Submit a sitemap_index.xml that references every page you care about.
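One way to catch partial coverage is to compare the pages you care about against what the sitemaps actually list. The sketch below assumes you have already downloaded the child sitemap XML as strings; the helper names and example.com URLs are illustrative.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def extract_locs(xml_text):
    """Return every <loc> value found in one sitemap document."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter(f"{SITEMAP_NS}loc") if el.text]

def missing_from_sitemaps(important_urls, sitemap_documents):
    """List the important URLs that no sitemap document mentions."""
    listed = set()
    for xml_text in sitemap_documents:
        listed.update(extract_locs(xml_text))
    return [url for url in important_urls if url not in listed]

# Illustrative child sitemaps.
products_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/products/standing-desk</loc></url>
</urlset>"""

collections_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/collections/desks</loc></url>
</urlset>"""

important = [
    "https://example.com/products/standing-desk",
    "https://example.com/collections/desks",
    "https://example.com/products/corner-desk",
]

print(missing_from_sitemaps(important, [products_sitemap, collections_sitemap]))
```

Here the corner-desk page is reported as missing, which is the kind of gap that leads to partial indexation.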

4. Canonical tags

Duplicate PDPs from ?variant= query strings confuse assistants as much as they confuse Google. A valid rel="canonical" on every PDP that points to the clean URL is a free fix and it prevents the retriever from choosing a low-authority duplicate over your real product page.
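A quick way to audit this is to derive the clean URL from the variant URL and compare it with the canonical tag the page actually emits. This is a sketch using only the Python standard library; it assumes the only duplicate-causing parameter is variant, and the helper names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def clean_product_url(url):
    """Drop the variant query parameter and any fragment from a PDP URL."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k != "variant"]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

class CanonicalFinder(HTMLParser):
    """Record the href of the last <link rel="canonical"> tag seen."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "link" and (attr.get("rel") or "").lower() == "canonical":
            self.canonical = attr.get("href")

def canonical_is_clean(page_url, html):
    """True when the page's canonical tag points at the variant-free URL."""
    finder = CanonicalFinder()
    finder.feed(html)
    finder.close()
    return finder.canonical == clean_product_url(page_url)

page_url = "https://example.com/products/desk?variant=123"
good_html = '<head><link rel="canonical" href="https://example.com/products/desk"></head>'
bad_html = '<head><link rel="canonical" href="https://example.com/products/desk?variant=123"></head>'

print(canonical_is_clean(page_url, good_html))  # True
print(canonical_is_clean(page_url, bad_html))   # False
```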

5. HTML2 renderable pass

Assistants use a lighter-weight renderer than Chromium when they crawl. Content that only appears after client-side JavaScript hydration (a common Shopify pattern for review widgets, size guides, and sustainability blocks) may not make it into the retrieval snapshot. If your reviews are a Judge.me widget with no SSR fallback, the 184 reviews you're proud of are invisible to the model.
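A rough proxy for what a non-rendering crawler sees is the raw HTML with scripts and styles ignored. The sketch below checks whether a phrase survives that filter. It is an approximation under that assumption, not a model of any specific crawler, and the class and function names are ours.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collect text nodes, skipping anything inside <script> or <style>."""
    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())

def visible_without_js(raw_html, phrase):
    """True if `phrase` appears in the page text without running any script."""
    parser = VisibleText()
    parser.feed(raw_html)
    parser.close()
    return phrase.lower() in " ".join(parser.chunks).lower()

# Review text injected by a script: present in the source, absent from the text.
widget_only = '<div id="reviews"></div><script>render("Great desk, very sturdy")</script>'
# Review text rendered on the server.
with_fallback = '<div id="reviews"><p>Great desk, very sturdy</p></div>'

print(visible_without_js(widget_only, "Great desk"))    # False
print(visible_without_js(with_fallback, "Great desk"))  # True
```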

Layer 2 — Signal (five checks)

6. Product + Offer JSON-LD

The baseline. If your PDPs don't emit a valid Product with a nested Offer and a price, assistants down-weight the page in their ranker. The full fix is: name, description, sku, gtin, brand, offers (price, priceCurrency, availability, url), aggregateRating, review.
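As a sketch of the full field list, the Python below builds the JSON-LD as a dictionary and serialises it. The product values are invented, and the product_jsonld helper is illustrative; in a real theme this markup would usually be emitted by the template rather than by a script.

```python
import json

def product_jsonld(p):
    """Build schema.org Product markup with a nested Offer and rating."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": p["name"],
        "description": p["description"],
        "sku": p["sku"],
        "gtin": p["gtin"],
        "brand": {"@type": "Brand", "name": p["brand"]},
        "offers": {
            "@type": "Offer",
            "price": p["price"],
            "priceCurrency": p["currency"],
            "availability": "https://schema.org/InStock",
            "url": p["url"],
        },
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": p["rating"],
            "reviewCount": p["review_count"],
        },
        "review": [
            {
                "@type": "Review",
                "author": {"@type": "Person", "name": r["author"]},
                "reviewBody": r["body"],
            }
            for r in p["reviews"]
        ],
    }

# Invented example values.
sample_product = {
    "name": "Compact Standing Desk",
    "description": "A 100 cm electric standing desk for small home offices.",
    "sku": "DESK-100",
    "gtin": "00012345678905",
    "brand": "Example Desks",
    "price": "349.00",
    "currency": "USD",
    "url": "https://example.com/products/compact-standing-desk",
    "rating": 4.7,
    "review_count": 184,
    "reviews": [{"author": "Sam", "body": "Sturdy and quiet."}],
}

markup = product_jsonld(sample_product)
print(json.dumps(markup, indent=2))
```

The printed JSON goes inside a script tag of type application/ld+json on the PDP.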

7. FAQPage schema

FAQPage is the single most-lifted schema type in assistant answers, because each Question + Answer pair is already sentence-shaped. Five questions on every collection page and every bestseller PDP, ideally ones a real shopper would ask, move citation share fast.
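The markup itself is small. Here is a minimal sketch that turns question and answer pairs into FAQPage JSON-LD; the questions are invented and the faq_jsonld helper is ours.

```python
import json

def faq_jsonld(pairs):
    """Build schema.org FAQPage markup from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }

# Invented shopper questions.
faq = faq_jsonld([
    ("Does the desk fit a 120 cm alcove?",
     "Yes. The desktop is 100 cm wide and needs 5 cm of clearance on each side."),
    ("How loud is the motor?",
     "The motor is rated at under 50 dB while the desk is moving."),
])

print(json.dumps(faq, indent=2))
```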

8. Review density

Models don't trust stores with 6 reviews and a 5.0 average. They quietly prefer stores with 150+ reviews in the 4.4–4.8 range. If you're below 50 reviews on a product you want cited, that's worth a dedicated email campaign before you tune anything else.

9. Freshness date

A dateModified older than 180 days signals staleness to retrievers that apply recency heuristics. If you haven't touched the page in a year, the PDP copy is probably also stale. Fix the copy, then update the field.
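The 180-day threshold is easy to check in bulk. A small sketch, assuming dateModified is stored as a plain YYYY-MM-DD date (values that carry a time component would need trimming first); the is_stale helper is hypothetical.

```python
from datetime import date

def is_stale(date_modified_iso, today, max_age_days=180):
    """True when dateModified is more than `max_age_days` before `today`."""
    modified = date.fromisoformat(date_modified_iso)
    return (today - modified).days > max_age_days

audit_day = date(2024, 12, 31)
print(is_stale("2024-01-01", audit_day))  # True, 365 days old
print(is_stale("2024-10-01", audit_day))  # False, 91 days old
```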

10. Brand name matches domain

Publish a schema.org Organization whose name is your storefront brand and whose url is your domain. Stores whose brand renders as "Store 72" but whose domain is 72desks.com confuse the entity-resolution pass. Assistants can't confidently name a store they can't uniquely identify.

Layer 3 — Narrative (four checks)

11. PDP lead under 55 words

The first paragraph of your PDP is the one the model reads. If it buries the differentiator under paragraphs of brand voice, the retriever won't promote it. Rewrite so the first 50–55 words state: what the product is, who it's for, and the one specific fact you want quoted.

12. Honest comparison page

Stores that publish a yourbrand-vs-competitor page with real trade-offs (not a puff piece) get cited on comparison prompts disproportionately. Assistants reward the source that sounds like it's actually weighing options, not the one that sounds like marketing.

13. Reddit presence

If your brand name doesn't show up in r/ threads your shoppers read, the retriever has no external corroboration and defaults to the brand that does. You don't need to astroturf — engaging honestly in two relevant subs a week for a quarter usually shows up in the training-data and retrieval mix.

14. Category-level llms hint

A single llms.txt at the root is the baseline; a second one at each category path (e.g. /collections/desks/llms.txt) that names your canonical PDPs for that category sharpens the retriever's ability to disambiguate when the shopper asks a category question.

Figure 2 — Citation share before and after closing the six misses. Six weeks, same store, no new paid spend. ChatGPT 0% → 28%, Perplexity 0% → 34%, AI Overviews 0% → 11%, Claude 0% → 19%.

What moves first

We've run this diagnostic on 1,200+ stores. The distribution is boring: the same six misses explain 9 out of 10 absences. If you do only these six, in this order, the needle moves inside a quarter.

1. Ship /llms.txt at the site root: about 3 KB, listing your 20 canonical URLs. Do this today.
2. Delete any Disallow for GPTBot, ClaudeBot, PerplexityBot from robots.txt. Redeploy, wait 72 hours.
3. Add FAQPage JSON-LD to every collection and bestseller PDP, with five honest questions minimum.
4. Rewrite the first paragraph of your top 20 PDPs to 50-55 words, facts-first.
5. Publish at least one honest comparison page against your strongest competitor.
6. Align schema:Organization name with your storefront brand + domain. Redeploy.

How long until you're in the citation set?

Shortest path: 6 days (llms.txt + GPTBot unblock + PDP lead rewrite for one product, running against a prompt set that's specific enough to your SKU). Typical: 4–6 weeks to reach a meaningful share of voice. The compounding effect matters — each fix is small, but the retriever behaviour is non-linear. A store on 4 of the 14 checks is functionally invisible. A store on 10 of 14 gets picked regularly.

One last calibration

Being absent from ChatGPT is not a brand problem. It's an infrastructure problem presenting as a brand problem. Stores with strong brands get cited as easily as stores with weak brands — assuming both have done the 14 things above. The stores that are absent are almost always absent because nobody on the team has owned this work, not because the product or the marketing is wrong.

Tags: ChatGPT, Shopify, GEO, Diagnostic, llms.txt
