How many competitors should I include in the audit?

Three to five named competitors is the sweet spot. Fewer and you miss patterns; more and the analysis becomes unwieldy. Pick the three or four brands that show up most in your category's AI answers plus one or two aspirational competitors (brands bigger or more established than you, to see the ceiling signals). The panel will naturally surface additional brands as you run the prompts; capture those too, but do not expand the named-competitor set.

How long does a full gap audit take?

Two afternoons for the first one, maybe a third half-afternoon to write the roadmap. Phase one (panel build) is 2-3 hours with good inputs. Phase two (running the panel) is 3-4 hours. Phase three (diffing) is 2-3 hours. Phase four (roadmap) is 1-2 hours. After the first run, quarterly follow-ups compress to roughly 4-5 hours because the panel is stable.

Which AI engines should I include in the panel?

The five biggest for commerce in 2026 are ChatGPT, Google AI Overviews, Perplexity, Gemini, and Microsoft Copilot. Add Claude if your category is information-heavy (tech, research, professional services). Add Grok if you sell into community-driven categories (gaming, streetwear, creator-led brands). Add Kagi and You.com if your category skews technical or premium. Most merchants need six engines; specialists may need seven or eight.

How do I score citations across engines that present answers differently?

Normalise to two simple metrics: did your brand get cited (0 or 1) and, if so, where in the answer did the citation appear (rank 1-5). The answer format differs across engines — ChatGPT often gives a bulleted list, Perplexity gives a prose answer with footnoted sources, Google AI Overviews presents boxed summaries — but the cited-or-not binary and rough rank can be recorded consistently for all of them. Keep it simple; sophistication at the scoring level does not help the roadmap at the end.
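A minimal sketch of that normalisation in Python, assuming you have already read the brands out of each answer in the order they appear. The function name score_citation is hypothetical, and clamping any position beyond the fifth to rank 5 is an assumption made for illustration rather than part of the method described above.

```python
from typing import Optional, Sequence, Tuple


def score_citation(brand: str, brands_in_answer: Sequence[str]) -> Tuple[int, Optional[int]]:
    """Return (cited, rank) for one prompt/engine answer.

    cited is 0 or 1; rank is 1-5 when cited, otherwise None.
    Positions beyond the fifth are clamped to 5 (an assumption).
    """
    normalised = [b.strip().lower() for b in brands_in_answer]
    target = brand.strip().lower()
    if target not in normalised:
        return 0, None
    position = normalised.index(target) + 1  # 1-based position in the answer
    return 1, min(position, 5)


# Hypothetical brand names, for illustration only.
chatgpt_answer = ["Acme Shoes", "Northwind", "Your Brand"]
print(score_citation("Your Brand", chatgpt_answer))   # (1, 3)
print(score_citation("Your Brand", ["Acme Shoes"]))   # (0, None)
```

The same two values can be recorded whether the engine returned a bulleted list, footnoted prose, or a boxed summary, which is the point of keeping the scoring this simple.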

Should I audit my own brand's presence in AI answers first, before comparing to competitors?

Yes. Run the panel against your own brand-inclusive prompts first — do you show up when the shopper already knows your name? If the answer is no, fix that before chasing category-exclusive prompts. You cannot expect to win prompts where no one has heard of you if you cannot even win prompts where someone has.

What if the audit reveals a competitor is winning because of something I cannot easily replicate?

Perfectly legitimate outcome. Some competitive advantages are structural — a 20-year-old brand, a massive review moat, first-party creator relationships. The audit tells you which gaps are worth pursuing and which are not. If a competitor's citation advantage is their 50,000 reviews, acknowledge that and focus the roadmap on gaps you can actually close — schema, content depth, structured data coverage. Honest recognition of unclosable gaps is a feature of the methodology, not a bug.


How to run a competitor gap audit for AI search

A competitor gap audit tells you exactly which prompts your rivals are winning, which they are losing, and which structural signals separate their AI-cited pages from yours. Two afternoons of work, and the result is a prioritised roadmap rather than a guess.

Evan Mallick with Harry Parker

Generative Commerce Analyst

10 min · Updated April 21, 2026


Why a gap audit beats a generic strategy conversation

Generic AI strategy stalls because it lacks specifics. A gap audit produces a specific competitor, a specific prompt, a specific missing signal, and a specific fix.

Most conversations about 'improving AI visibility' stall because they lack the specificity required to act. The marketer knows they should 'show up more in ChatGPT' but has no concrete picture of which prompts they are missing, which competitor they are losing to, or which specific signal separates the cited pages from their own. Without that picture, every improvement move is a guess. A gap audit fixes the specificity problem directly — you leave the exercise with a list of exact competitor URLs, exact prompts they win, and exact signals they emit that you do not.

The other benefit is that gap audits compound. The 30-prompt panel you build for the first audit becomes the baseline for the quarterly follow-up — and because you have specific metrics (citation counts, source types, sentiment), the follow-up is a proper before-and-after measurement rather than a vibes-based check-in.

5-10 prioritised interventions produced by a typical two-afternoon gap audit.

Source: Surfient case-study average, 28 competitor gap audits completed for Shopify merchants, 2024-2026. Intervention count varies by category depth and competitor density.

Figure (step flow): The four-step arc this guide walks through. Each numbered card maps to a section below.

Phase 1 — build the 30-query prompt panel

Mix brand-inclusive, category-exclusive, and comparison prompts. Pull from real buyer research, not from SEO keyword tools.

The prompt panel is the core instrument of the audit. A weak panel (random keywords, SEO tool exports, generic questions) produces weak results. A strong panel reflects how real customers actually talk to AI assistants when considering your category — conversational, specific, and mixed across the buyer journey. We build 30 prompts across three types, evenly split.

Brand-inclusive (10 prompts): Include your brand name. 'Is [brand] good for X?', 'What do customers say about [brand]?', '[brand] vs [competitor]'. Tests your defence — are you winning prompts that already name you?
Category-exclusive (10 prompts): Do not name any brand. 'Best X for Y under $Z', 'What is the most comfortable running shoe for flat feet?', 'Recommend a sustainable leather bag brand'. Tests your offence — can you show up when the shopper has not heard of you?
Comparison (10 prompts): Name two or three brands including yours. '[Your brand] vs [competitor A] vs [competitor B] for X'. Tests head-to-head visibility and where AIs side with rivals over you.
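The three-way split above can be checked mechanically before you start running prompts. The following is a rough Python sketch; the Prompt class, the check_panel helper, and the ID scheme are hypothetical names chosen for illustration, and the placeholder texts stand in for wording you would take from real buyer research.

```python
from collections import Counter
from dataclasses import dataclass

PROMPT_TYPES = ("brand-inclusive", "category-exclusive", "comparison")


@dataclass(frozen=True)
class Prompt:
    prompt_id: str
    prompt_type: str
    text: str


def check_panel(panel: list[Prompt], per_type: int = 10) -> list[str]:
    """Return a list of problems; an empty list means the panel is evenly split."""
    problems = []
    counts = Counter(p.prompt_type for p in panel)
    for prompt_type in PROMPT_TYPES:
        if counts[prompt_type] != per_type:
            problems.append(f"{prompt_type}: expected {per_type}, found {counts[prompt_type]}")
    unknown = set(counts) - set(PROMPT_TYPES)
    if unknown:
        problems.append(f"unknown prompt types: {sorted(unknown)}")
    return problems


# Placeholder prompts; a real panel would use wording from buyer research.
panel = [
    Prompt(f"{ptype[:3].upper()}-{i:02d}", ptype, f"placeholder {ptype} prompt {i}")
    for ptype in PROMPT_TYPES
    for i in range(1, 11)
]
print(len(panel), check_panel(panel))  # 30 []
```

Stable prompt IDs matter more than the exact scheme: they are what lets a later quarterly run line up against the same prompts.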

Sources for real buyer prompts

Customer support chats and emails — the questions customers actually asked before buying. This is the single richest source.
Sales team call notes — what objections and comparison questions came up most often.
Reddit threads in your category — 'Help me decide between X and Y' style posts are live buyer prompts.
AnswerThePublic, AlsoAsked, and similar tools — useful but leave for last; they tend toward SEO-shaped phrasings rather than conversational ones.
Your own Google Search Console and Shopify search data — what people typed on your own site is what they would ask an AI.

Phase 2 — run the panel across engines and score every citation

Run each of 30 prompts across 5-6 engines. Log citations, source types, sentiment, and competitor mentions in a structured spreadsheet.

The second phase is execution — putting every prompt into every engine and recording structured output. It is tedious and takes the bulk of the audit's time, but the discipline of recording structured data is what lets you produce comparisons in phase three. Budget 3-4 hours of focused work; do not try to multitask this.

1. Open a spreadsheet with columns: Prompt ID, Prompt Text, Engine, Your Citations (0/1), Your Rank (1-5), Competitor Citations (list), Source Types (web / reddit / forum / creator), Sentiment (positive / neutral / negative).
2. Run each prompt in ChatGPT. Log citations and source types for your brand and every competitor mentioned.
3. Repeat in Perplexity, Google AI Overviews, Gemini, Copilot, and Claude. Budget 30-45 minutes per engine.
4. For each citation, note the specific URL cited. Those URLs become the basis of phase three diffing.
5. Don't limit the panel to your named competitors; capture every brand that gets cited across the 30 prompts. The brands you didn't know about are often the most interesting findings.
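If you would rather keep the log as a CSV file than a hand-edited spreadsheet, the columns from step 1 plus the cited URLs from step 4 can be sketched as a Python dataclass. The field names, the semicolon-separated list encoding, and the example row are assumptions made for illustration; any layout that preserves the same columns would serve.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields
from typing import Optional


@dataclass
class CitationRow:
    prompt_id: str
    prompt_text: str
    engine: str
    your_citation: int            # 0 or 1
    your_rank: Optional[int]      # 1-5, or None when not cited
    competitor_citations: str     # semicolon-separated brand names
    cited_urls: str               # semicolon-separated URLs, used in phase three
    source_types: str             # web / reddit / forum / creator
    sentiment: str                # positive / neutral / negative


def write_rows(rows: list[CitationRow], handle) -> None:
    """Write rows as CSV with a header matching the dataclass fields."""
    writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(CitationRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))


# Hypothetical observation, for illustration only.
rows = [
    CitationRow("CAT-01", "Best X for Y under $Z", "ChatGPT", 0, None,
                "Competitor A;Competitor B", "https://example.com/competitor-a/product",
                "web;reddit", "neutral"),
]
buffer = io.StringIO()
write_rows(rows, buffer)
print(buffer.getvalue())
```

One row per prompt per engine keeps the log flat, so 30 prompts across six engines yields 180 rows that can be filtered or pivoted later.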

3-4 hours: typical time to run a 30-query panel across six AI engines.

Source: Surfient methodology timing, averaged across 28 competitor gap audits. Includes prompt execution, citation logging, and structured-data capture but not the subsequent diff work.

Phase 3 — diff every cited competitor URL against your own

For each prompt you lost, compare the cited competitor page against your equivalent on schema, content, and off-site signals.

Phase three is where the 'gaps' become specific and actionable. For every prompt where a competitor beat you, open the cited competitor URL alongside your closest equivalent page and diff the two on structural signals. The goal is to find the recurring patterns — the signals that show up across competitor-winning pages and are missing from yours — because those patterns form the intervention roadmap.

Schema diff: Use Rich Results Test on both URLs. Log what schema types they emit (Product, FAQPage, BreadcrumbList, HowTo, Review, Article) and any fields present on theirs that are absent on yours.
Content depth diff: Word count, H2 structure, FAQ presence, spec depth, customer review count, photo count. The competitor's page is almost always deeper and more structured.
Content specificity diff: Do they answer specific sub-questions (wrist sizes, fit notes, compatibility)? Do they have numeric claims (water resistance to 5 ATM)? Your pages may be thinner on specifics.
Off-site signal diff: Where is the competitor mentioned that you are not? Reddit, forums, category publications, creator reviews. Use Surfient or manual site: searches to map.
Technical hygiene diff: Canonical coherence, llms.txt presence, ai-sitemap.xml freshness, hreflang correctness, server response time. These are rarely the deciding factor, but they are worth checking.
Feed / shopping signal diff: Do they appear in ChatGPT Shopping cards? In Google AI Shopping? Their merchant feed is likely cleaner or more complete.
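The schema diff in particular lends itself to a small script. The sketch below is one possible approach, assuming you have already copied the JSON-LD block out of each page (for example from the page source). The helper names schema_fields and schema_gaps are hypothetical, and the code only inspects top-level nodes with a single string @type, so nested types such as an AggregateRating inside a Product show up as a property name rather than a type of their own.

```python
import json


def schema_fields(jsonld_text: str) -> dict[str, set[str]]:
    """Map each top-level @type in a JSON-LD document to its property names."""
    data = json.loads(jsonld_text)
    nodes = data if isinstance(data, list) else data.get("@graph", [data])
    result: dict[str, set[str]] = {}
    for node in nodes:
        node_type = node.get("@type")
        if isinstance(node_type, str):  # skip nodes with missing or list-valued @type
            props = {key for key in node if not key.startswith("@")}
            result.setdefault(node_type, set()).update(props)
    return result


def schema_gaps(theirs: dict[str, set[str]], yours: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return the types and fields the competitor emits that you do not."""
    gaps = {}
    for schema_type, their_props in theirs.items():
        missing = their_props - yours.get(schema_type, set())
        if schema_type not in yours or missing:
            gaps[schema_type] = missing
    return gaps


# Hypothetical JSON-LD snippets, for illustration only.
competitor = '{"@type": "Product", "name": "Watch", "offers": {}, "aggregateRating": {}, "review": []}'
yours = '{"@type": "Product", "name": "Watch", "offers": {}}'
gaps = schema_gaps(schema_fields(competitor), schema_fields(yours))
print({t: sorted(f) for t, f in gaps.items()})  # {'Product': ['aggregateRating', 'review']}
```

Running this across every lost prompt and counting how often each missing field recurs gives you the recurring patterns that phase four turns into themes.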

Phase 4 — convert the gaps into a prioritised roadmap

Cluster recurring gaps into intervention themes. Prioritise by expected citation lift vs effort. Ship the top 3 in the first sprint.

The final phase translates the raw gap data into a roadmap. The recurring patterns across competitor-winning pages cluster into a handful of themes, typically 5-10. Each theme is an intervention you can resource against. Prioritise by expected citation lift (how many of the 30 prompts does this intervention potentially affect?) versus effort (how many hours to ship?) and rank accordingly.

The common intervention themes we see

Product schema completeness — competitor has aggregateRating, review, additionalProperty; you have only name/price/brand. Intervention: ship richer Product schema site-wide.
FAQ schema presence — competitor has FAQPage on every PDP; you have none. Intervention: author per-product FAQs and emit FAQPage schema.
Content depth on PDPs — competitor's PDP is 2,000 words with fit guide, care instructions, spec table; yours is 400 words marketing copy. Intervention: deepen PDPs with structured facts.
Off-site community presence: competitor is visible on 3-4 category subreddits and category forums; you are absent. Intervention: launch a 6-month Reddit participation programme.
Buying-guide coverage — competitor has 10 long-form buying guides for the category; you have none. Intervention: ship a buying-guide hub.
Review count and freshness: competitor has 200 reviews per top product; you have 20. Intervention: invest in review acquisition.

High lift, low effort: Ship immediately. Schema gaps, feed gaps, canonical fixes. Target: first sprint.
High lift, high effort: Resource into a multi-month programme. Content depth, review acquisition, off-site community. Target: quarterly themes.
Low lift, low effort: Backlog. Cosmetic fixes, minor schema refinements, small signal tweaks. Target: bundle with future sprints.
Unknown lift: Test and measure. Hypotheses worth a week but not a quarter. Target: quarterly experimentation slot.
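The lift-versus-effort ranking can be sketched in a few lines of Python. Everything numeric here is an assumption for illustration: the cut-offs of 8 prompts and 20 hours, the example estimates, and the choice to rank by prompts affected per hour are not prescribed by the methodology, and the Intervention class is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass
class Intervention:
    name: str
    prompts_affected: int   # how many of the 30 panel prompts this could influence
    effort_hours: float     # estimated hours to ship; must be greater than zero


def quadrant(item: Intervention, lift_cutoff: int = 8, effort_cutoff: float = 20.0) -> str:
    """Place an intervention in a lift/effort quadrant using illustrative cut-offs."""
    lift = "high lift" if item.prompts_affected >= lift_cutoff else "low lift"
    effort = "high effort" if item.effort_hours > effort_cutoff else "low effort"
    return f"{lift}, {effort}"


def rank(items: list[Intervention]) -> list[Intervention]:
    """Order by prompts affected per hour of effort, highest first."""
    return sorted(items, key=lambda i: i.prompts_affected / i.effort_hours, reverse=True)


# Hypothetical estimates, for illustration only.
backlog = [
    Intervention("Richer Product schema", prompts_affected=12, effort_hours=8),
    Intervention("Deepen PDP content", prompts_affected=15, effort_hours=60),
    Intervention("Minor schema refinements", prompts_affected=2, effort_hours=4),
]
for item in rank(backlog):
    print(f"{item.name}: {quadrant(item)}")
```

A ratio like this is only a starting order. Interventions whose lift you cannot estimate yet are better handled as time-boxed experiments than forced into the ranking.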

Running the audit on a quarterly cadence

First audit is exploratory. Subsequent quarterly audits reuse the prompt panel and become a clean before/after measurement.

The first gap audit is exploratory — you are building the panel, learning the engines, and discovering which competitors show up. By the second quarterly run, the panel is stable and the audit compresses into a much faster exercise. You reuse the 30 prompts, re-run across the engines, and compare the citation counts to last quarter's baseline. That comparison tells you which of your shipped interventions moved the needle and which did not.

1. Q1 audit: exploratory. Build panel, run full phase 1-4. Ship top 3 interventions. 2 afternoons.
2. Q2 audit: first measurement. Reuse panel, re-run, compare against Q1 baseline. Identify which interventions worked, which did not. Ship next 3. 1 afternoon.
3. Q3 audit: pattern-recognition. Gaps are narrower and more specific. Competitor set may have shifted. Adjust panel slightly. 1 afternoon.
4. Q4 audit: annual review. Revisit competitor set, revise panel, produce year-over-year trend. Inform next year's GEO budget. 1 afternoon.
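The before-and-after comparison comes down to a per-engine citation rate for each quarter and the difference between the two. Here is a minimal Python sketch, assuming each logged row is a dictionary with "engine" and "your_citation" keys. Those key names and the two-row sample data are assumptions for illustration.

```python
from collections import defaultdict


def citation_rate_by_engine(rows: list[dict]) -> dict[str, float]:
    """Share of prompts per engine where your brand was cited (0.0 to 1.0)."""
    cited = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        total[row["engine"]] += 1
        cited[row["engine"]] += int(row["your_citation"])
    return {engine: cited[engine] / total[engine] for engine in total}


def quarter_delta(baseline: list[dict], current: list[dict]) -> dict[str, float]:
    """Change in citation rate per engine, for engines present in both runs."""
    before = citation_rate_by_engine(baseline)
    after = citation_rate_by_engine(current)
    return {engine: round(after[engine] - before[engine], 3)
            for engine in before if engine in after}


# Hypothetical two-prompt excerpt of a Q1 and Q2 log, for illustration only.
q1 = [{"engine": "ChatGPT", "your_citation": 0}, {"engine": "ChatGPT", "your_citation": 1}]
q2 = [{"engine": "ChatGPT", "your_citation": 1}, {"engine": "ChatGPT", "your_citation": 1}]
print(quarter_delta(q1, q2))  # {'ChatGPT': 0.5}
```

The comparison is only clean if the prompt panel is unchanged between runs. When the panel is adjusted, as in the Q3 and Q4 audits, compare only the prompts present in both quarters.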

“The merchants who win on AI visibility are not the ones with the cleverest single move. They are the ones whose gap audits quarter over quarter show consistent narrowing of the specific gaps they identified — because that narrowing is compounding, and compounding wins on long time horizons.”

— Evan Mallick, Generative Commerce Analyst, Onviqa Inc.

Frequently asked questions

Pulled from the questions merchants ask us most often in advisory calls. Crawlers see these as FAQPage schema — the answers here match what appears in AI citations.

