Beauty has always been the category where AI engines have the strongest opinion. Shoppers ask retrievers questions like "what's a non-comedogenic retinol for combination skin under $40?" with the full expectation of a shortlist — and they get one. The interesting question for Shopify merchants is whose products the retrievers shortlist. We measured it: 4,820 prompts across the three major engines, 320 Shopify beauty stores in the candidate pool, Q1 2026. Ten brands own 82% of what comes back.
The panel
We built the prompt panel from three sources: SEMrush's shopping-intent keyword set for beauty (1,100 queries), the top-asked beauty questions in Reddit's r/SkincareAddiction and r/MakeupAddiction over the trailing 90 days (2,200 prompts), and 1,520 prompts our own customers ran through the Surfient panel on their brand-research loops. We ran each prompt once against ChatGPT (GPT-5), once against Claude (Sonnet 4.5), and once against Perplexity (default mode) in early February, parsed the cited URLs, and attributed each citation to a Shopify store in our candidate pool.

The three things the panel tells you
First, the concentration is real but not absolute. The top 3 brands together own 46.7% of citations. That feels brutal — until you look at the next 7, which collectively earn another 36%. The difference between rank 3 and rank 10 is 12 points of citation share, which is achievable: Tower 28 closed 4 points of gap in the trailing 6 months by shipping ingredient metafields and the six-question FAQPage template.
Second, the engines don't agree. On ChatGPT, Supergoop and Drunk Elephant are nearly tied. On Claude, Drunk Elephant is #2 behind Glossier by a wide margin and Supergoop is #6. On Perplexity, The Ordinary overtakes Glossier. The per-engine concentration numbers tell the rest: Claude is cautious and concentrates on legacy names; ChatGPT will surface a mid-tier brand if the product page answers the specific question.
Third, the tail matters. Those 312 mid-tier Shopify beauty stores split 18% of citations — which, on 4,820 prompts, is about 868 individual citations. That's 2.8 citations per store per quarter on average. For a store doing $2M/year, even a 2× lift from 2.8 to 5.6 citations translates to an incremental $40-80K depending on the prompt intent mix. It's not trivial money, and it's not hard to earn.
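The tail arithmetic above can be reproduced in a few lines. This is a sketch that uses only the figures quoted in this post and, like the paragraph above, treats each prompt as yielding one citation; the variable names are ours.

```python
# Figures quoted in the post; variable names are illustrative.
TOTAL_PROMPTS = 4820
TAIL_SHARE = 0.18    # share of citations going to stores outside the top 10
TAIL_STORES = 312

# Simplifying assumption carried over from the text: one citation per prompt.
tail_citations = round(TOTAL_PROMPTS * TAIL_SHARE)
per_store = tail_citations / TAIL_STORES

print(tail_citations)            # 868
print(round(per_store, 1))       # 2.8
print(round(per_store * 2, 1))   # 5.6
```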
Why the top 10 win — six structural patterns
We pulled the full product feed for each top-10 store plus a random 50 from the 312-store tail. Six patterns showed up repeatedly on the winners and rarely on the next tier. None require a bigger marketing budget.

Pattern 1 — Ingredient-level metafields
Every top-10 brand except one ships ingredient lists as structured arrays, not paragraphs. Each array entry is { inci, percentage, function }. The Ordinary is the canonical example: "Niacinamide 10% + Zinc 1%" is not a marketing tagline, it's the literal metafield value surfaced into their JSON-LD and llms-full.txt. AI retrievers grade fact-density; an ingredient array scores roughly 3× a prose paragraph of the same word count.
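To make the array shape concrete, here is a minimal Python sketch of an ingredient list with the three fields named above. The `Ingredient` class, the helper functions, and the sample values are hypothetical; how you store the result (metafield type, namespace, key) depends on your own store setup and is not shown.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class Ingredient:
    inci: str                    # INCI name as printed on the label
    percentage: Optional[float]  # None when the concentration is undisclosed
    function: str                # short functional role


def to_metafield_value(ingredients):
    """Serialize the array as compact JSON, one object per ingredient."""
    return json.dumps([asdict(i) for i in ingredients], separators=(",", ":"))


def headline(ingredients):
    """Build a 'Name 10% + Name 1%' summary from the disclosed percentages."""
    return " + ".join(
        f"{i.inci} {i.percentage:g}%" for i in ingredients if i.percentage is not None
    )


# Illustrative values only: the INCI name for the zinc component and both
# function labels are assumptions, not data pulled from any store's feed.
SERUM = [
    Ingredient("Niacinamide", 10.0, "brightening"),
    Ingredient("Zinc PCA", 1.0, "sebum control"),
]

print(headline(SERUM))
print(to_metafield_value(SERUM))
```

Keeping the percentage numeric rather than baking it into a string means the same array can feed the product title, the JSON-LD, and llms-full.txt without re-parsing.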
Pattern 2 — Skin-type concordance
Beauty shopping prompts are overwhelmingly shaped as "what's the best X for [skin type / concern]". 90% of our panel prompts fit this shape. The top-cited brands all have explicit skin-type metadata on every SKU, combined with concern tags (acne, hyperpigmentation, anti-aging) and contraindications (pregnancy, photosensitivity). Retrievers match the question shape to the metadata shape; stores without the metadata get filtered out before the generation step.
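A toy illustration of why the metadata shape matters: if a prompt reduces to (skin type, concern, things to avoid), a SKU can only be shortlisted when it carries those same fields. This is not how any engine works internally; `SkuProfile`, `shortlist`, and the sample catalog are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SkuProfile:
    sku: str
    skin_types: set = field(default_factory=set)
    concerns: set = field(default_factory=set)
    contraindications: set = field(default_factory=set)


def shortlist(catalog, skin_type, concern, avoid=()):
    """Return SKUs tagged for the skin type and concern, minus contraindicated ones."""
    avoid = set(avoid)
    return [
        p.sku
        for p in catalog
        if skin_type in p.skin_types
        and concern in p.concerns
        and not (p.contraindications & avoid)
    ]


# Hypothetical catalog; the third SKU has no metadata at all.
CATALOG = [
    SkuProfile("retinol-serum", {"combination", "oily"}, {"anti-aging", "acne"},
               {"pregnancy", "photosensitivity"}),
    SkuProfile("azelaic-cream", {"combination", "sensitive"}, {"acne", "hyperpigmentation"}),
    SkuProfile("untagged-moisturizer"),
]

print(shortlist(CATALOG, "combination", "acne"))
print(shortlist(CATALOG, "combination", "acne", avoid={"pregnancy"}))
```

The untagged SKU never appears in either result, which is the filtering effect described above.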
Pattern 3 — Published clinical or panel data
Six of the top 10 publish study-level numbers on efficacy claims: "78% of 224 participants showed measurable reduction in wrinkle depth at 12 weeks" with a method link. Marketing adjectives like "radiant" and "plumping" are ignored by retrievers. Panel sizes matter — n=50 is below the usefulness threshold; n=200+ gets consistently quoted. The Ordinary, Drunk Elephant, Supergoop, and Youth to the People all ship this; most mid-tier brands don't.
Pattern 4 — Honest FAQ comparison
Seven of the top 10 answer the comparison question in FAQPage JSON-LD (Q6 of our six-question template) honestly — naming prestige competitors and acknowledging where they win. This single FAQ answer is cited more than any other question type in beauty prompts because it's often the only place a brand acknowledges the real comparison.
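For reference, a comparison answer in FAQPage JSON-LD follows the schema.org shape of `FAQPage` > `mainEntity` > `Question` > `acceptedAnswer`. The Python sketch below builds that structure; the `faq_page` helper is hypothetical, and the bracketed question and answer text are placeholders to replace with your own wording.

```python
import json


def faq_page(pairs):
    """Build a schema.org FAQPage dict from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Placeholder text: fill in the real competitor and the honest trade-offs.
comparison = (
    "How does this compare to [prestige competitor]?",
    "[Where the competitor wins.] [Where this product wins.]",
)

doc = faq_page([comparison])
print(json.dumps(doc, indent=2))
```

The printed JSON would go inside a `<script type="application/ld+json">` tag on the product page.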
Pattern 5 — Shade / finish normalization
Foundation and concealer brands that ship Fitzpatrick-scale and undertone codes (warm, cool, neutral) on each shade variant get recommended for "what foundation matches my skin tone?" prompts far more often than brands that ship only shade numbers. AI engines reason over the Fitzpatrick taxonomy; they don't reason over arbitrary shade codes.
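One way to picture shade normalization is a per-variant record that carries the brand's own shade code alongside a Fitzpatrick type (I to VI) and one of the three undertone codes. The sketch below is an assumption about how such a record might be validated; `ShadeVariant`, `matching_shades`, and the sample shade codes are hypothetical.

```python
from dataclasses import dataclass

FITZPATRICK_TYPES = ("I", "II", "III", "IV", "V", "VI")
UNDERTONES = ("warm", "cool", "neutral")


@dataclass(frozen=True)
class ShadeVariant:
    shade_code: str    # the brand's own, otherwise arbitrary, code
    fitzpatrick: str   # one of FITZPATRICK_TYPES
    undertone: str     # one of UNDERTONES

    def __post_init__(self):
        # Reject values outside the shared taxonomies at construction time.
        if self.fitzpatrick not in FITZPATRICK_TYPES:
            raise ValueError(f"unknown Fitzpatrick type: {self.fitzpatrick!r}")
        if self.undertone not in UNDERTONES:
            raise ValueError(f"unknown undertone: {self.undertone!r}")


def matching_shades(variants, fitzpatrick, undertone):
    """Return shade codes whose normalized fields match the request."""
    return [
        v.shade_code
        for v in variants
        if v.fitzpatrick == fitzpatrick and v.undertone == undertone
    ]


# Hypothetical shade range.
SHADES = [
    ShadeVariant("110N", "II", "neutral"),
    ShadeVariant("240W", "III", "warm"),
    ShadeVariant("250C", "III", "cool"),
]

print(matching_shades(SHADES, "III", "warm"))
```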
Pattern 6 — A current llms-full.txt
Seven of the top 10 publish a current, regenerated-on-edit llms-full.txt with the entire product catalog, ingredient data, and usage guides. Only two of the next 50 brands do. GPTBot and ClaudeBot visit this file daily on winning brands; we see the hits in logs. Our llms-full.txt post has the template; the beauty-specific additions are ingredient arrays and skin-type concordance tables.
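As a rough sketch of "regenerated on edit", the function below re-renders the whole file from structured product data each time it is called. The markdown layout here is our own assumption for illustration, not the template from the llms-full.txt post, and the function names and sample product are hypothetical.

```python
def render_product(name, ingredients, skin_types):
    """Render one product as a markdown section with inline ingredient data."""
    lines = [f"## {name}", "", "Ingredients:"]
    for ing in ingredients:
        lines.append(f"- {ing['inci']}: {ing['percentage']:g}% ({ing['function']})")
    lines += ["", "Skin types: " + ", ".join(sorted(skin_types))]
    return "\n".join(lines)


def render_llms_full(store_name, products):
    """Render the full file; call this again whenever any SKU changes."""
    sections = [f"# {store_name}"] + [render_product(**p) for p in products]
    return "\n\n".join(sections) + "\n"


# Hypothetical store and product data.
PRODUCTS = [
    {
        "name": "Example Serum",
        "ingredients": [
            {"inci": "Niacinamide", "percentage": 10.0, "function": "brightening"},
        ],
        "skin_types": {"oily", "combination"},
    },
]

text = render_llms_full("Example Store", PRODUCTS)
print(text)
```

Rendering from the same structured data that feeds the metafields keeps the file from drifting out of date when a SKU is edited.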
What to do if you're on the tail
If you're one of the 312 Shopify beauty stores outside the top 10, here's the playbook that's worked for the brands we've watched climb.
- Audit one SKU first, not all of them. Pick your best-seller. Add ingredient metafields, skin-type concordance, and the six-FAQ template to just that one product. Measure citation change over 30 days before rolling to the full catalog.
- Run the panel weekly — all three engines. ChatGPT surfaces mid-tier brands first, Claude last. A weekly prompt run shows where the first wins are landing.
- Prioritise ChatGPT prompts in your panel. The ChatGPT tail is 4× longer than Claude's, so mid-tier brands see the first citation lift there. Shifting the Claude score takes months; shifting ChatGPT takes weeks.
- Publish a quarterly ingredient panel study. Even an n=80 in-house user panel with photographed before/after gets quoted if the methodology is on-page. Write it up like a study, not a marketing post.
- Write the Q6 comparison honestly. Name a prestige competitor, say where they win, say where you win. This is the single highest-leverage sentence you will write this quarter.
- Ship llms-full.txt with ingredients inline. Not the SEO prose, the actual structured data. Regenerate on every SKU edit.
- Monitor crawler logs for GPTBot and ClaudeBot hitting your llms-full.txt. First bot hit is usually 48-72 hours after you publish; first citation follows 10-14 days later.
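For the log-monitoring step in the playbook, a minimal sketch might look like the following. It assumes you can export access logs in the common "combined" format (for example from a CDN or proxy in front of the store); the regular expression, the `bot_hits` helper, and the sample lines are illustrative, and the sample lines are fabricated rather than real traffic.

```python
import re
from collections import Counter

# Combined log format: host ident user [time] "request" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)
BOTS = ("GPTBot", "ClaudeBot")


def bot_hits(lines, path="/llms-full.txt"):
    """Count requests to `path` per crawler, matched by user-agent substring."""
    hits = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or m["path"].split("?", 1)[0] != path:
            continue
        for bot in BOTS:
            if bot in m["ua"]:
                hits[bot] += 1
    return hits


# Fabricated sample lines using documentation-only IP ranges.
SAMPLE = [
    '192.0.2.1 - - [03/Feb/2026:10:00:00 +0000] "GET /llms-full.txt HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.7 - - [03/Feb/2026:10:05:00 +0000] "GET /llms-full.txt HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '198.51.100.9 - - [03/Feb/2026:10:06:00 +0000] "GET /products/serum HTTP/1.1" 200 900 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
]

hits = bot_hits(SAMPLE)
print(dict(hits))
```

User-agent strings can be spoofed, so treat these counts as a signal of crawler interest rather than proof.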
What's ahead for Q2
The interesting question for Q2 2026 is whether the top 3 continue to concentrate or whether the tail closes. Our models show the top 3 probably flat or slightly down: AI engines are actively counteracting citation monoculture this year, and Google's AI Overviews in particular are surfacing more mid-tier brands in their responses. The opportunity for mid-tier Shopify beauty stores isn't evenly distributed across time; it's front-loaded to the first half of this year. Ship the six patterns before Q3.