Every SEO agency has a GEO slide now. Most of them are selling the same services under a new label: keyword research (now “AEO keyword research”), backlink campaigns (now “AI authority building”), and Google-ranking reports (now “SGE visibility reports”). Meanwhile, the measurement stack that actually moves citation share — the weekly panel, the log parser, the declarative PDP rewrite — is nowhere in the deck. Here's how to tell the difference, and a ten-question checklist to take into every pitch meeting.
The rebrand problem
We tracked 11 Shopify stores that signed with traditional SEO agencies on GEO programs branded as AI-optimisation or Answer-Engine Optimisation. At Day 90, 7 of the 11 had zero citation-share movement on their panels. The 4 that did move moved because the agency happened to employ one writer who rewrote PDPs aggressively, not because of the keyword research or backlink campaigns the agency was billing for.
This isn't agencies being dishonest on purpose. Most of them genuinely don't know that GEO is different work. Their playbooks were built for Google ranking signals and they pattern-matched the new space onto familiar services. The result is a rebrand, not a new program. You're paying agency-tier prices for the exact deliverables they sold you two years ago.

What most agencies are actually selling
Walk through a typical GEO deck slide by slide and count how many items map directly onto the old SEO stack.
“AEO keyword research”
A list of prompts shoppers might type into ChatGPT. Useful if paired with a panel that actually measures your coverage — worthless otherwise. Most decks stop at the list. Ask where the list came from and how often it's refreshed.
“AI authority building”
Usually a backlink campaign with the word “AI” stapled on. Backlinks matter less in GEO retrieval than entity corroboration — a Wikidata entry, a Reddit thread that ranks, a niche reviewer's article. None of those come from a standard backlink campaign. Ask specifically which third-party surfaces they target and how they measure corroboration.
“Schema.org markup audit”
This one can be real work, but it's almost always sold as a one-off deliverable. Schema drifts: collections get new templates, themes get updated, Offer.url quietly disappears. Without a regression process (re-test monthly and on every theme change), a one-off schema audit decays in weeks. Ask for the regression process, not the audit.
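A regression check of this kind can be small. The sketch below is one possible shape, not a standard tool: it pulls JSON-LD out of a rendered PDP and reports which Offer fields have gone missing. The function name `missing_offer_fields` and the required-field list are our assumptions; adjust them to whatever your theme is supposed to emit.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed body of every <script type="application/ld+json"> tag."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                self.blocks.append(None)  # malformed JSON-LD is itself a regression


def missing_offer_fields(html, required=("url", "price", "priceCurrency")):
    """Return required Offer fields absent from the first Product block.

    Returns ["Product"] when the page has no Product markup at all.
    The `required` tuple is an assumption, not a Schema.org rule.
    """
    parser = JsonLdExtractor()
    parser.feed(html)
    for block in parser.blocks:
        if isinstance(block, dict) and block.get("@type") == "Product":
            offers = block.get("offers") or {}
            if isinstance(offers, list):
                offers = offers[0] if offers else {}
            return [field for field in required if field not in offers]
    return ["Product"]
```

Run it against a saved copy of each PDP template after every theme change; any non-empty result is a regression worth a ticket.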
“Google SGE visibility report”
Measuring Google SGE alone ignores ChatGPT, Perplexity, Claude, and Gemini, the four engines where most purchase-intent traffic actually lives. A GEO report that only covers Google is half a program. Ask how many engines they measure and how often.
“Monthly ranking report”
Google keyword positions. Not citation share. Not AI-cited sessions. If your agency's monthly report is a keyword-ranking spreadsheet, they are not doing GEO.
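For contrast, here is roughly what a citation-share number looks like when computed. This is a minimal sketch under our own working definition, which is an assumption: citation share for an engine is the percentage of panel queries whose cited sources include your domain, and movement is the difference between two weeks in percentage points. The data layout and the sample rows are hypothetical.

```python
from collections import defaultdict


def citation_share(results, domain):
    """Percent of panel queries per engine whose citations include `domain`.

    `results` is an iterable of (engine, query, cited_domains) tuples.
    """
    asked = defaultdict(int)
    cited = defaultdict(int)
    for engine, _query, cited_domains in results:
        asked[engine] += 1
        if domain in cited_domains:
            cited[engine] += 1
    return {engine: 100.0 * cited[engine] / asked[engine] for engine in asked}


def share_movement(baseline, current):
    """Change in citation share per engine, in percentage points."""
    return {engine: current[engine] - baseline.get(engine, 0.0) for engine in current}


# Hypothetical two-query panel on two engines (a real panel is larger).
week_1 = [
    ("chatgpt", "best pour-over kettle", ["example-store.com", "reddit.com"]),
    ("chatgpt", "gooseneck kettle under $60", ["othershop.com"]),
    ("perplexity", "best pour-over kettle", ["reddit.com"]),
    ("perplexity", "gooseneck kettle under $60", ["example-store.com"]),
]
print(citation_share(week_1, "example-store.com"))
```

A keyword-ranking spreadsheet cannot produce this number, because it never records which sources an engine cited.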
What a real GEO program actually ships
Six deliverables. Each is measurable, each has a human owner, and each runs on a weekly or monthly cadence.
- Weekly 40-query citation panel across ChatGPT, Perplexity, Claude, and Gemini — reported as citation-share points per engine.
- PDP prose rewrite on the top-20% of SKUs — one claim per sentence, specific numbers, standards citations.
- FAQPage schema at collection level — 3-5 real shopper questions per collection, answered in 40-80 words.
- Monthly server-log analysis — GPTBot and PerplexityBot fetch patterns, 4xx/5xx trends, crawl budget per page class.
- External corroboration program — Reddit presence, reviewer outreach (10-25 per quarter), Wikidata entity maintenance.
- Content cadence — 2-3 buying guides per week filling named citation gaps from the weekly panel.
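The server-log deliverable is the one most often skipped, so here is a minimal sketch of what it involves. It assumes access logs in the common "combined" format and matches the `GPTBot` and `PerplexityBot` user-agent tokens by substring. The `page_class` rule (first path segment) is our simplification, and the function names are illustrative only.

```python
import re
from collections import Counter

# Matches the request, status, and user-agent parts of a combined-format line.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)
AI_BOTS = ("GPTBot", "PerplexityBot")


def page_class(path):
    """First path segment: "/products/kettle" -> "products", "/" -> "home"."""
    segment = path.split("?")[0].strip("/").split("/")[0]
    return segment or "home"


def bot_fetch_counts(lines):
    """Count AI-bot fetches keyed by (bot, page class, status class)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in combined format
        for bot in AI_BOTS:
            if bot in match["agent"]:
                status_class = match["status"][0] + "xx"
                counts[(bot, page_class(match["path"]), status_class)] += 1
    return counts
```

Feeding it a month of log lines (for example `bot_fetch_counts(open("access.log"))`, where the filename is a placeholder) gives the fetch pattern per page class and shows whether 4xx/5xx responses to these bots are trending up.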
Ten questions for every GEO pitch
Print this checklist and take it to every vendor meeting. Three or more red-flag answers and you're buying the label.

The ten questions, with scoring
- How do you measure success weekly? — Good: 40-query panel × 4 engines → citation-share points. Red flag: monthly Google-ranking report.
- Who owns the citation panel? — Good: a named human with a dashboard deliverable. Red flag: 'our tool does it'.
- What's in the first-month deliverable? — Good: audit + top-10 rewrite + panel baseline + log parse. Red flag: keyword list + backlink plan.
- How do you rewrite PDP copy? — Good: one claim per sentence, specific numbers, standards citations. Red flag: AI-generated rewrites from a prompt template.
- What external corroboration work do you do? — Good: Reddit presence + reviewer outreach + Wikidata. Red flag: directory submissions, link building.
- Do you parse server logs? — Good: yes, GoAccess or BigQuery monthly. Red flag: 'we use Google Analytics'.
- Can I see a sample 30-day report? — Good: yes, here's one with engine breakdown. Red flag: 'we customise per client'.
- Who's on the team? — Good: named writer + analyst + dev with GEO experience. Red flag: pool of juniors.
- What happens at Day 60 if citation share hasn't moved? — Good: we stop shipping and diagnose. Red flag: we double content volume.
- What's the exit clause? — Good: 30-day notice, no lock-in. Red flag: 12-month minimum.
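If you are comparing several vendors, the scoring rule is easy to keep in a script. The sketch below applies only the threshold stated above (three or more red flags); the answer labels "good" and "red" and the function name are our assumptions.

```python
RED_FLAG_THRESHOLD = 3  # "three or more red-flag answers" from the checklist


def score_pitch(answers):
    """Tally red flags from a dict mapping question -> "good" or "red"."""
    red = [question for question, answer in answers.items() if answer == "red"]
    return {
        "red_flags": len(red),
        "buying_the_label": len(red) >= RED_FLAG_THRESHOLD,
        "questions": red,
    }


# Hypothetical notes from one pitch meeting (only four questions recorded).
pitch = {
    "weekly measurement": "red",
    "panel owner": "red",
    "server logs": "red",
    "exit clause": "good",
}
print(score_pitch(pitch))
```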
In-house vs. agency, honestly
The best GEO programs we've seen are in-house. The skill mix is: one writer who can write declarative specs (the hardest role to hire), one part-time analyst for the weekly panel (4-6 hours/week), and ~0.25 FTE of dev time for schema regression. That's ~1.3 FTE total, which many Growth-stage stores can staff for less than a $6K/month agency retainer.
The case for an agency (or an agency + tool like us) is when you need the tooling infrastructure — panel automation, log-parsing pipelines, dashboard builds — and you don't want to build it. That's a legitimate reason. What isn't a legitimate reason is “we don't know what GEO is, so we'll let the agency figure it out.” If you can't judge whether they're shipping the work, you'll pay for the label.