Surfient module · Guard

Catch AI engines misquoting your brand — and file the correction

Surfient continuously probes ChatGPT, Claude, Perplexity, and Google AI Overviews with shopper-realistic prompts, parses every claim about your products, and alerts you when a model says something that isn't true.

  • Weekly prompt panel across four major engines surfaces every claim the model makes about your store — prices, specs, policies, made-of, guarantees.
  • A fact-verifier cross-checks each claim against your live Shopify catalog and approved Brand Facts — mismatches become hallucination events with severity scores.
  • Hallucination events route to a correction workflow: update the source of truth, strengthen the citation surface, and re-query the model to confirm the fix landed.

Hallucination Guard

Flagged claims → ground-truth fixes

7 corrected
Hallucination flagged
Perplexity · 2m ago

What the engine said

“The Alpine Merino base layer is machine washable up to 90°C.”

Ground truth from your PDP

Source: product.json

Cold hand-wash only. Machine dryer will felt the fibres.

The problem

An AI assistant got your spec wrong — and you have no way to know

Your customer asked Claude whether your jacket is waterproof. Claude said yes, but the jacket is only water-resistant, and that customer is now a return waiting to happen. Nobody emailed you. Nobody reported a bug. The misquote keeps happening in private conversations until your return rate spikes and you're left guessing why.

  • 1 in 4

    AI product-spec claims contain at least one factual drift from the source

    Our sampling across 6,000 product-spec prompts, January-March 2026. Drift ranges from trivial (off by 2%) to severe (wrong material, wrong warranty).

  • $0

    is what the standard Shopify reputation stack spends defending against AI misquotes

    Because the stack doesn't know they're happening. A misquote that drives 40 returns a month is invisible in every standard reporting tool.

  • 18 days

    median time between a schema correction on-page and the model updating its answer

    Faster than you'd guess. Once the fix is published at the source, most engines refresh within two to three weeks, but only if you know to push the fix.

How it works

Probe, verify, correct — on a rolling 7-day cadence

You can't manually audit an AI for every product. The Guard does it automatically, with an explicit verifier loop so every flag is evidence-backed.

  1. Probe with shopper-realistic prompts

    The prompt panel for the Guard runs adjacent to the Visibility Monitor's — same engines, different prompts. Claims-probing prompts are phrased like a customer asking a specific question: "Is the Aurora jacket waterproof?" rather than "Tell me about it." This surfaces the facts a real buyer would act on.

  2. Extract every verifiable claim

    A claim extractor walks every answer and pulls assertions of fact: price, material, dimensions, warranty period, shipping window, country of origin, stock status. Each claim is structured as a proposition tied to a product and an engine.

  3. Verify against source of truth

    Each claim is checked against your live Shopify data (for catalog facts) and your approved Brand Facts profile (for brand claims). A mismatch past the configurable tolerance becomes a hallucination event. Events are severity-scored by how likely they are to cause a purchase decision or a return.

  4. Correct at the surface — and confirm the fix

    Each event carries a suggested correction: update the product's schema, add a Brand Fact, add an FAQ entry that disambiguates. Apply the correction through the Fix Pack, and the Guard re-queries the engine over the following days to confirm the misquote is gone. Close the loop — or escalate.
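
The verify step above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Surfient's implementation: the `Claim` fields, the `is_mismatch` and `verify` helpers, the attribute names, and the case-insensitive comparison for categorical facts are all hypothetical. The 5% numeric default mirrors the tolerance described in the FAQ, and the price figures are made up for the example.

```python
from dataclasses import dataclass
from typing import Union

Value = Union[int, float, str]


@dataclass(frozen=True)
class Claim:
    """One extracted assertion, tied to a product and an engine (hypothetical shape)."""
    product: str
    engine: str
    attribute: str
    value: Value


def is_mismatch(claimed: Value, truth: Value, tolerance: float = 0.05) -> bool:
    """Return True when a claim contradicts the source of truth."""
    numeric = (int, float)
    if isinstance(claimed, numeric) and isinstance(truth, numeric):
        if truth == 0:
            return claimed != 0
        # Numeric facts: flag only when relative drift exceeds the tolerance.
        return abs(claimed - truth) / abs(truth) > tolerance
    # Categorical facts: equality after trimming and lowercasing (an assumption).
    return str(claimed).strip().lower() != str(truth).strip().lower()


def verify(claims, ground_truth):
    """Return the claims that contradict ground_truth, keyed by (product, attribute)."""
    events = []
    for claim in claims:
        key = (claim.product, claim.attribute)
        if key in ground_truth and is_mismatch(claim.value, ground_truth[key]):
            events.append(claim)
    return events


# Illustrative data only.
truth = {
    ("Aurora jacket", "water_protection"): "water-resistant",
    ("Aurora jacket", "price"): 100.0,
}
claims = [
    Claim("Aurora jacket", "Claude", "water_protection", "waterproof"),
    Claim("Aurora jacket", "ChatGPT", "price", 104.0),
]
events = verify(claims, truth)
```

In this sketch only the "waterproof" claim becomes an event; the price claim drifts by 4%, which sits inside the 5% tolerance.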

Inside the app

What you’ll see after install

Every number a Shopify merchant running Surfient Hallucination Guard tracks, at a glance and live from the Surfient admin: AI engine splits, revenue lift, and the exact state of your catalog across ChatGPT, Perplexity, Google AI Overviews, Claude, Gemini, and Copilot.

Capabilities

What the Guard actually catches

Seven classes of hallucination, all of them common, all of them silent if you aren't looking.

  • Spec misquotes

    Wrong material, wrong dimensions, wrong weight, wrong colour availability. This is the most common class and the one that drives the most returns. Every product's full spec sheet is loaded into the verifier, so even a dimension that is off by as little as 5% is caught.

  • Price misquotes

    AI engines cache prices longer than you think. A product whose price changed six weeks ago can still be quoted at the old price — especially dangerous when the old price was lower. The Guard flags stale pricing, and the Fix Pack pushes the schema refresh to accelerate the model's update.

  • Policy misquotes

    "Free returns for 60 days" when yours is 30 is a reputational landmine. The Guard parses policy claims (returns, shipping, warranty) and verifies them against your Shopify policy pages and Brand Facts. Mismatches route to legal + support for the correction.

  • Made-of / ingredient drift

    Especially critical for beauty, food, and apparel. A model claiming your product is vegan, gluten-free, or organic when it isn't is a compliance and reputation risk. The verifier treats these claims as sev-1.

  • Availability drift

    AI engines that keep recommending out-of-stock SKUs erode trust: the shopper clicks, sees "sold out," and bounces. The Guard flags claims that contradict live inventory, and the Fix Pack schedules schema refreshes to encourage engines to re-verify.

  • Competitor confusion

    Sometimes models conflate two brands with similar names or adjacent categories. The Guard surfaces answers where a competitor's product is described under your brand name (or vice versa), with evidence snippets to support a formal correction request where supported.

  • Citation-to-source mismatch

    A model citing your URL while making a claim your page doesn't contain. Happens more than you'd expect — engines mix context across answers. The Guard flags these as "unsupported citations" and the Content Engine drafts the text that makes the citation valid.

  • Severity scoring + reputation weighting

    Not every drift is a fire. The Guard scores severity by likelihood-of-purchase-impact and likelihood-of-return, and weights by your reputation surface: a hallucination on your #1 bestseller outranks one on a long-tail SKU. Alerts focus attention where it matters.
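
One way to picture severity scoring and reputation weighting is the sketch below. The formula is an assumption chosen for illustration, not the Guard's actual scoring: `severity` combines the two likelihoods as the chance that at least one of them occurs, and `priority` scales that by a hypothetical `revenue_share` input standing in for the reputation surface.

```python
def severity(purchase_impact: float, return_likelihood: float) -> float:
    """Chance that a drift affects a purchase or causes a return (inputs in [0, 1])."""
    return 1.0 - (1.0 - purchase_impact) * (1.0 - return_likelihood)


def priority(purchase_impact: float, return_likelihood: float,
             revenue_share: float) -> float:
    """Weight severity by the product's share of revenue (hypothetical weighting)."""
    return severity(purchase_impact, return_likelihood) * (1.0 + revenue_share)


# The same drift on a bestseller (30% of revenue) and on a long-tail SKU (1%).
bestseller = priority(0.5, 0.5, revenue_share=0.30)
long_tail = priority(0.5, 0.5, revenue_share=0.01)
```

With identical drift, the bestseller scores higher than the long-tail SKU, which is the ordering the description above calls for.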

Customer proof

We caught ChatGPT claiming our cast-iron pans were non-stick coated. They're seasoned iron — the exact opposite. Within a fortnight of shipping the fix, the misquote was gone and our return rate on that SKU fell by a third.
Yuki Tanaka · Customer Experience Lead, Ember & Hearth

-33%

return rate on flagged SKU

FAQ

Questions, answered straight

  • Can you actually stop an AI from hallucinating?

    No, not at the model level. What you can do is make it harder for the model to hallucinate about your brand specifically: publish clear structured data, publish an llms-full.txt with approved facts, and keep product schema fresh. The Guard surfaces every drift, the Fix Pack closes the source gap, and over two to three weeks the engines' answers converge on the correct facts. We've measured it across 80+ customer brands.

  • What triggers a hallucination event?

    A model answer contains a claim that, when extracted and verified, contradicts your live Shopify data or approved Brand Facts by more than the tolerance you set (default: 5% on numerics, strict equality on categorical facts like material or warranty period). Every event includes the full answer text, the extracted claim, the source of truth, and the delta.

  • Does this work for brands that have thin structured data today?

    Yes — but you'll get more signal once the structured-data surface is solid. We'd normally recommend running the GEO Audit Engine first, shipping the top-10 audit fixes via the Fix Pack, and then turning the Guard on. The tighter your source of truth, the more confident the verifier is in flagging drift.

  • What if the model is right and my site is wrong?

    The verifier flags the contradiction either way. Sometimes the truth is that your Shopify data is stale and the model picked it up from a fresher source (a spec sheet you uploaded elsewhere, a press release). You can accept the model's version and update your source — the event closes the same way.

  • How often does the Guard run?

    Weekly by default, with an option to run a subset of high-risk prompts daily. After applying a correction via the Fix Pack, the Guard schedules a re-query cascade (48 hours, 7 days, 14 days, 28 days) against each engine to confirm the misquote has cleared — or escalate if it hasn't.

  • Does the Guard replace my PR team for AI-era reputation work?

    No. The Guard gives your PR and customer-experience teams data they didn't have before. The creative work of drafting a formal correction, reaching out to an engine's policy contact, or adjusting brand strategy still belongs to humans — the Guard just makes sure those humans aren't flying blind.
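
The re-query cascade described above (48 hours, 7 days, 14 days, 28 days) can be sketched as a simple schedule. This is an illustration under assumptions: the function names, the status labels, and the rule that an event escalates only after the final check still shows the misquote are hypothetical, not documented Guard behaviour.

```python
from datetime import datetime, timedelta
from typing import Sequence

# Offsets taken from the cascade described in the FAQ.
REQUERY_OFFSETS = (
    timedelta(hours=48),
    timedelta(days=7),
    timedelta(days=14),
    timedelta(days=28),
)


def requery_schedule(applied_at: datetime, engines: Sequence[str]):
    """Return (engine, check_time) pairs for every engine and every offset."""
    return [(engine, applied_at + offset)
            for engine in engines
            for offset in REQUERY_OFFSETS]


def event_status(still_misquoted: Sequence[bool]) -> str:
    """Status after the completed checks; True means the misquote was still present."""
    if still_misquoted and not still_misquoted[-1]:
        return "resolved"
    if len(still_misquoted) >= len(REQUERY_OFFSETS):
        return "escalated"
    return "pending"


schedule = requery_schedule(datetime(2026, 1, 1), ["ChatGPT", "Claude"])
```

Here `schedule` holds eight checks, four per engine, and `event_status([True, False])` reports the event as resolved once a check comes back clean.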

Find out what AI engines are saying about you — this week

A free hallucination scan runs 50 shopper-realistic prompts against ChatGPT, Perplexity, Claude, and Google AI Overviews, extracts every claim, and flags the drifts. You'll know what needs fixing before you decide whether to pay for anything.