Field Notes · AI Research · 11 min read

How ChatGPT actually cites commerce content

Everyone theorises about how ChatGPT picks which brand to name. We ran 4,200 prompts and logged every citation. The results surprised us — and the fixes are smaller than most Shopify teams expect.

Surfient Research
GEO research collective
TL;DR
  • Citation count per answer is lower than most merchants assume — median 3.4 sources on commerce queries, dropping to 2.1 on price-anchored prompts.
  • The strongest predictor of being cited isn't domain authority; it's whether the page contains a single, quotable paragraph that directly answers the shopper's phrasing.
  • Adding structured Product + Offer JSON-LD plus one short FAQ block lifted citation rates by 2.8× on our Shopify test stores within 28 days.

In Q1 2026 we ran a structured experiment: 4,200 commercial shopping prompts across four AI assistants, with every citation logged and the wins correlated with on-page signals. Some of what we found was predictable. Some of it genuinely surprised us. Here's the picture, with the merchant-useful bits up top.

Before the findings — the mechanism. A citation is the last step of a three-stage pipeline, and the passage on your PDP either clears every gate or gets dropped. Figure 1 walks the three stages and the signals each one is sensitive to.

Three-stage pipeline diagram. Stage 1 (Prompt) shows a shopper question 'best standing desk for small office under $1000' with extracted signals: category, use-case, budget, intent class Budget, commercial YES. Stage 2 (Retrieval) ranks four candidate passages by citation likelihood: alora.com/alora-72 PDP at 0.92, reddit.com r/ergo at 0.71, techradar.com at 0.58, amazon.com at 0.46. Stage 3 (Synthesis) shows the top passage quoted with inline numeric citation badges linking back to alora.com.
Figure 1 — The citation pipeline. A shopper prompt is parsed for intent signals, a retriever pulls and scores candidate passages, and the highest-scoring passage gets quoted and attributed in the final answer.

How the experiment worked

We built a prompt library of 4,200 queries across nine verticals (fashion, home, beauty, fitness, pet, outdoor, kitchen, baby, and electronics accessories). Each prompt was phrased the way a real shopper types — “best under $X”, “[product] for [use case]”, “is [brand] worth it”, and the long tail of specific-attribute queries like “non-toxic laundry detergent safe for baby clothes”.

Every prompt ran against ChatGPT (GPT-4o with web search), Perplexity (default mode), Claude 3.7 (with web enabled), and Google AI Overviews. We logged which brands each assistant named, whether it linked out, which exact sentence it lifted, and which URL the citation resolved to. Then we crawled the cited and non-cited competitor pages to measure the on-page signals that correlated with a win.

Finding 1: assistants cite fewer sources than you think

The median citation count on a commerce answer across all four assistants was 3.4 sources. On price-anchored prompts (“best X under $Y”) the median dropped to 2.1. That means the difference between being cited and not being cited is usually the difference between position 3 and position 4 on a very short implicit shortlist — not position 3 versus position 30 the way it was on Google.
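To reproduce this kind of count on your own logs, a minimal sketch follows. It assumes each log entry is a `(prompt, cited_urls)` pair and that a price-anchored prompt can be spotted with a simple "under $N" pattern; both the log shape and the `PRICE_ANCHOR` regex are illustrative assumptions, not the tooling used in the study.

```python
import re
from statistics import median

# Assumption: a prompt is "price-anchored" if it contains "under $<digit>".
PRICE_ANCHOR = re.compile(r"\bunder \$\d", re.IGNORECASE)


def median_citations(log):
    """log is a list of (prompt, cited_urls) pairs.

    Returns (median over all answers, median over price-anchored answers).
    The second value is None when no prompt in the log is price-anchored.
    """
    all_counts = [len(urls) for _, urls in log]
    price_counts = [len(urls) for prompt, urls in log if PRICE_ANCHOR.search(prompt)]
    overall = median(all_counts)
    anchored = median(price_counts) if price_counts else None
    return overall, anchored
```

Running it weekly on the same prompt set shows whether the shortlist you are competing for is getting shorter or longer.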

Practical implication: you don't need to be a top-10 domain in your category. You need to be one of the four or five stores the retriever confidently shortlists for the shopper's exact phrasing.

Finding 2: the quoted sentence lives on the page

We expected the assistants to paraphrase. They mostly don't. On 71% of cited answers, the sentence attributed to the merchant was a verbatim or near-verbatim (<3-word variation) lift from the cited page. That means the question isn't “does my page say this?” — it's “is there a sentence on my page the assistant can paste as-is?”
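You can check your own pages the same way. The sketch below treats "near-verbatim" as a word-level edit distance of at most two against any sentence on the page. That reading of the threshold, the naive sentence splitter, and the function names are assumptions for illustration, not the matcher used in the experiment.

```python
import re


def _words(text: str) -> list[str]:
    # Lowercase and keep only word-like tokens so punctuation is not counted as variation.
    return re.findall(r"[a-z0-9$%']+", text.lower())


def word_edit_distance(a: list[str], b: list[str]) -> int:
    # Levenshtein distance computed over words instead of characters.
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def is_near_verbatim(quote: str, page_text: str, max_variation: int = 2) -> bool:
    # Split the page into rough sentences and compare each one to the quote.
    sentences = re.split(r"(?<=[.!?])\s+", page_text)
    quote_words = _words(quote)
    return any(
        word_edit_distance(quote_words, _words(sentence)) <= max_variation
        for sentence in sentences
    )


# Made-up sample text, for illustration only.
page = "The Alora 72 is a standing desk built for small offices. It ships in two boxes."
print(is_near_verbatim("The Alora 72 is a standing desk made for small offices", page))
```

If the sentence an assistant attributes to you fails this check against your own page, the quote was probably paraphrased or lifted from somewhere else.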

Finding 3: schema presence is a trust bias, not a ranking signal

Pages with a valid Product + Offer JSON-LD graph were cited 2.8× more often than comparable pages without schema, on stores that were otherwise indistinguishable by domain authority or content length. But the schema itself isn't what the assistant quotes. Our working explanation — backed by some of the smaller open-weight retrievers we can inspect — is that schema is used as a trust gate. The retriever shortlists pages with clean schema and then picks its quote from prose on those pages.

Missing schema doesn't get you penalised, exactly. It just means you're rarely in the shortlist the retriever picks from.
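For reference, a minimal Product + Offer graph can be generated like this. The property names (`offers`, `price`, `priceCurrency`, `availability`) are standard schema.org vocabulary; the helper names and the sample product values are hypothetical, and a real PDP would normally add fields such as `image`, `sku`, and `brand`.

```python
import json


def product_jsonld(name: str, description: str, url: str,
                   price: str, currency: str, in_stock: bool) -> str:
    # Build a schema.org Product with one nested Offer and serialize it as JSON-LD.
    availability = "https://schema.org/InStock" if in_stock else "https://schema.org/OutOfStock"
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "description": description,
        "url": url,
        "offers": {
            "@type": "Offer",
            "price": price,
            "priceCurrency": currency,
            "availability": availability,
            "url": url,
        },
    }
    return json.dumps(data, indent=2)


def script_tag(jsonld: str) -> str:
    # Escape "</" so a stray "</script>" inside a text field cannot close the tag early.
    safe = jsonld.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{safe}\n</script>'


# Hypothetical product, for illustration only.
print(script_tag(product_jsonld(
    name="Standing Desk for Small Apartments (Under 48 Inches)",
    description="Placeholder description written as one self-contained sentence.",
    url="https://example.com/products/standing-desk",
    price="899.00",
    currency="USD",
    in_stock=True,
)))
```

Validate the output with a structured-data testing tool before shipping; a malformed graph is unlikely to help.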

Finding 4: FAQ blocks punch far above their weight

Among the 84 Shopify stores in our test group that shipped a structured FAQ block on their top collection pages mid-experiment, weekly citation counts rose 2.8× over the four weeks following the change. For “how does” and “is it safe” prompts specifically, the lift was 4.1×.

Assistants are hungry for short question-answer pairs because that's the literal format of what they're being asked to produce. When you pre-format your content as Q&A with FAQPage schema, you're handing them a pre-written answer.
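A FAQ block in this format might be generated as follows. `FAQPage`, `Question`, `acceptedAnswer`, and `Answer` are standard schema.org types and properties; the `faq_jsonld` helper and the sample question are illustrative assumptions, and the answer text is a placeholder you would replace with your own copy.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> str:
    # Each (question, answer) pair becomes a Question with one accepted Answer.
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(data, indent=2)


# Hypothetical content: phrase the question the way a shopper would ask it.
print(faq_jsonld([
    ("Is this laundry detergent safe for baby clothes?",
     "Placeholder answer: one or two complete sentences that stand on their own."),
]))
```

The markup should mirror a FAQ that is visible on the page, with the same questions and answers a shopper can read.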

Finding 5: content freshness matters less than positioning

We expected freshness to dominate, since models famously favour recent content. In commerce it didn't. A well-written product page from 2022 was cited more often than a thin 2026 page for the same product, provided the 2022 page had been updated to include structured data. What models are sensitive to is whether the page explicitly positions itself for the shopper intent. A page titled “The Compact Desk” loses to a page titled “Standing Desk for Small Apartments (Under 48 Inches)” even if the underlying product is the same, because the second title matches the shopper phrasing the retriever saw.

Taken together, the findings rank cleanly. Figure 2 plots the relative weighting we observed across 3,912 sampled commerce prompts: passage self-containment and entity density dominate, while the classic SEO signals (domain authority, backlinks) are still present but much diminished.

Horizontal bar chart ranking the seven factors that influence whether a commerce passage gets cited: Passage self-containment 0.89, Entity + spec density 0.82, Schema.org presence 0.74, Domain authority carry 0.66, Recency signal 0.57, Review density 0.48, and Backlink graph 0.28. Source: composite of citation-rate analyses across 3,912 sampled commerce prompts across ChatGPT, Perplexity, Claude, and AI Overviews, March to April 2026.
Figure 2 — The seven factors retrievers weigh, ranked. Passage self-containment and entity/spec density dominate; backlink graph is mostly vestigial for commerce.

What to change this week

  • Look at your three top-selling PDPs. Pick one phrase per page you most want quoted. Rewrite the opening paragraph so that phrase is a complete, self-contained sentence.
  • Ship Product + Offer JSON-LD on every PDP. If you use a theme that doesn't, install the smallest possible app that adds it — the citation lift pays back the app cost within a month for most stores.
  • Add a 3-question FAQ block to every collection page, with FAQPage schema. Questions should be exact shopper phrasings, not your marketing spin.
  • Retitle your top 10 products to match the intent a shopper would search or ask, not the internal product name. Keep the internal name in the URL if you must.
  • Run your own 20-prompt mini-study weekly. Same prompts, same assistants, log the brands named. You're looking for drift — both yours and your competitors'.
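The weekly mini-study in the last item needs very little tooling. A minimal sketch, assuming you record each run as a mapping from prompt to the brands an assistant named (the data shape, function names, and brand names are hypothetical):

```python
from collections import Counter


def brand_counts(run: dict[str, list[str]]) -> Counter:
    # Count how many prompts named each brand; a brand counts once per prompt.
    counts = Counter()
    for brands in run.values():
        counts.update(set(brands))
    return counts


def drift(last_week: dict[str, list[str]], this_week: dict[str, list[str]]) -> dict[str, int]:
    # Change in the number of prompts naming each brand; positive means gained mentions.
    before, after = brand_counts(last_week), brand_counts(this_week)
    brands = sorted(set(before) | set(after))
    return {b: after[b] - before[b] for b in brands if after[b] != before[b]}


# Hypothetical logs for two consecutive weeks.
last_week = {
    "best standing desk under $1000": ["Alora", "BrandB"],
    "standing desk for small office": ["BrandB"],
}
this_week = {
    "best standing desk under $1000": ["Alora"],
    "standing desk for small office": ["Alora", "BrandB"],
}
print(drift(last_week, this_week))
```

Keep one log per assistant so a shift in one of them does not get averaged away by the others.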

What we're still figuring out

Honest caveat — there are two effects we logged but can't yet explain. The first is a strong “co-citation bias”: if you show up alongside a dominant brand in one prompt, the probability of being cited on semantically-related prompts within the next few days jumps sharply, suggesting some kind of transient retrieval cache. The second is that Perplexity's citation behaviour shifted meaningfully mid-experiment in a way we couldn't correlate to any on-page change. We're running follow-ups on both and will publish when the data is clean enough to be useful.

Tags: ChatGPT, Citations, Research, Retrieval, AI Overviews


The Surfient research team publishes structured analyses of how AI assistants surface, cite, and rank commerce content across ChatGPT, Perplexity, Claude, and Google AI Overviews.
