Frequently asked questions

Is every long audit theatre, or can a 60-page audit be genuinely useful?

Length isn’t the tell — structure is. A 60-page signal audit will have 40+ of those pages dedicated to the baseline measurement dataset (prompt-level citation CSVs, engine-by-engine breakdowns, competitor comparison matrices) and 15-20 pages of tight prioritised recommendations. A 60-page theatre audit will have 55 pages of generic checklist items and 5 pages of filler. Flip to the data appendix. If there’s no prompt-level raw data, the length is padding.

How do I handle an audit we paid $15-30K for that scores under 15 on this scorecard?

Document the gap specifically: for each of the six dimensions, note what the audit delivered versus what it should have. Send it back to the auditor and ask for a revision covering the missing dimensions at no additional cost. Reputable firms will revise; theatre firms will argue. Either way, don’t pay for a second audit from the same vendor. Use the documented gap as a brief for the replacement engagement — it helps the next auditor scope faster and gives you clearer acceptance criteria.

Can I run this scorecard on an internal audit produced by my in-house SEO team?

Yes, and it’s often more useful there than on agency work. Internal audits fail the scorecard most commonly on dimensions 01 (engine specificity) and 05 (cross-surface comparison) because internal teams have limited exposure to the specific retrieval behaviour of each engine and often lack the tooling to benchmark against competitors systematically. The fix isn’t to fire the in-house team — it’s to pair them with one external signal audit per quarter so they calibrate against real retrieval data.

What’s a fair price range for a signal audit in 2026?

For a mid-sized Shopify brand ($5-20M revenue), a signal audit priced at $12-25K including baseline measurement, competitive comparison, and a 60-day re-measure is fair. Below $8K, the auditor can’t afford to do the measurement properly. Above $45K for a standalone audit (not paired with ongoing work), you’re paying for brand premium more than incremental diagnostic depth. Enterprise audits ($40K+) make sense when they’re scoping a year-long engagement, not as one-off diagnostics. Ask what portion of the fee covers measurement vs deliverable writing — signal audits typically split 50-60% measurement, 40-50% synthesis and write-up.

Our audit recommended meta description rewrites and increased word count — are those always worthless?

Not worthless for conventional organic Google, but neutral-to-negative for AI citation. Meta descriptions don’t influence AI retrieval (the retrieval pass is working with full page content or extracted embeddings, not description snippets). Increased word count actively hurts quotability on ChatGPT and Perplexity because longer pages dilute the claim-density signal that retrievers favour. If an audit leads with these, it’s confusing SERP-era ranking factors with GEO-era citation mechanics. You can do the rewrites if your ops team already has them scoped — just don’t expect citation lift from them, and don’t let them consume sprint capacity that should go to schema or FAQPage work.

“GEO audits” are 90% theatre

Most “GEO audits” you see in 2026 are still traditional SEO audits wearing a new jacket. Eighty pages of crawl statistics, meta description rewrites, internal link suggestions, and word count recommendations — none of which move AI citation share. We’ve reviewed forty-two GEO audit PDFs from agencies, consultants, and in-house teams in the last six months. Thirty-nine of them were theatre. This post is the scorecard we use to tell theatre from signal — so you can grade any audit in under twenty minutes before you pay for it.

What makes an audit “theatre”

Theatre audits borrow the credibility of traditional SEO auditing — comprehensive-looking, screenshot-heavy, checklist-exhaustive — without doing the actual job GEO auditing requires: reasoning about retrieval. The tell is always the same. Open the findings section, and every recommendation could have been generated by a 2019-era Screaming Frog report: add more internal links, increase word count, add alt text, compress images, fix broken links, update the sitemap. These aren’t wrong per se — a few still matter for conventional SERP — but none of them change whether Perplexity, ChatGPT, or AI Mode cites your site.

A signal audit starts from a different question: “If a buyer asks a retrieval-augmented LLM about this category, what does the engine actually do, and where is this merchant failing in that pipeline?” Answering requires a mental model of each engine’s retrieval pool, eligibility gates, and ranking signals — plus hands-on diagnosis of the specific schema, content, and corroboration patterns that determine inclusion.

Figure: Theatre column (left) versus signal column (right). Each finding carries a subtext explaining either why it does not move AI citation share (theatre) or which retrieval mechanic it fixes (signal).

Theatre findings, which look thorough but don't move AI citation share: add more internal links; update meta descriptions; increase word count to 2,000+; add alt text; compress images; submit sitemap; fix H1 structure; install HTTPS; mobile-friendly test.

Signal findings, which map directly to retrieval mechanics on Perplexity, ChatGPT, and Google AI Mode: MerchantReturnPolicy missing on 47 PDPs; FAQPage missing on top 20 categories; Reddit presence zero in 4 of 5 categories; Product offer price stale on 12 PDPs; Claude retrieval path untested; Perplexity shopping mode eligibility failing; ChatGPT citation frequency decaying; AI Mode turn 2 coverage missing; cross-engine share monitoring absent.

The six dimensions of a signal audit

We grade audits on six dimensions, five points each, for a total of thirty. Any audit scoring under fifteen is theatre; anything above twenty-one is worth paying for; the middle band is “ask the auditor three follow-up questions before buying.” The six dimensions sit at different layers of the retrieval pipeline, and a strong audit touches every one of them.

Figure: The six-dimension audit scorecard. Each dimension is worth 5 points.

01 Engine specificity: does the audit name Perplexity, ChatGPT, Claude, and AI Mode separately, or treat AI as one entity?
02 Baseline citation measurement: is there a cohort of 50+ prompts with measured citation share before tactics?
03 Schema diagnosis depth: does the auditor view-source and reason about Product, Offer, MerchantReturnPolicy, and FAQPage?
04 Content extractability: quotability, claim density, honest spec disclosure.
05 Cross-surface comparison: does the audit compare your site against 3 to 5 competitors on the same prompts?
06 Remediation sequencing: are fixes ordered by expected citation impact and effort?

Theatre audits average 6/30; signal audits average 27/30.
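If you want the grading to be repeatable across reviewers, the arithmetic fits in a few lines. This is a minimal sketch, not a tool we ship: the dimension keys and the function names `total_score` and `band` are our own labels for illustration, and the band boundaries follow the thresholds stated above (under 15, 15 to 21, above 21).

```python
DIMENSIONS = (
    "engine_specificity",
    "baseline_citation_measurement",
    "schema_diagnosis_depth",
    "content_extractability",
    "cross_surface_comparison",
    "remediation_sequencing",
)


def total_score(scores: dict[str, int]) -> int:
    """Sum the six dimension scores after checking each is an integer 0-5."""
    missing = set(DIMENSIONS) - scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name in DIMENSIONS:
        if not 0 <= scores[name] <= 5:
            raise ValueError(f"{name} must be between 0 and 5")
    return sum(scores[name] for name in DIMENSIONS)


def band(total: int) -> str:
    """Map a 0-30 total onto the three bands described in the text."""
    if total < 15:
        return "theatre"
    if total > 21:
        return "worth paying for"
    return "ask three follow-up questions"
```

Scoring every dimension at 1 gives a total of 6 and lands in the theatre band; scoring every dimension at 3 gives 18 and lands in the follow-up band.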

Dimension 01 — Engine specificity

Theatre audits treat “AI” as a single entity. They say things like “optimise for AI search” or “AI-friendly content.” Signal audits name each engine separately because each has a different retrieval architecture. Perplexity runs a three-pass retrieval with explicit citation. ChatGPT uses Bing search-grounded retrieval with OpenAI-weighted re-ranking. Claude pulls from a narrower curated pool. Google AI Mode synthesises from organic SERP results plus AI Overviews grounding. An audit that doesn’t name engines specifically cannot diagnose specifically.

Dimension 02 — Baseline citation measurement

If the auditor didn’t measure your current citation share on a cohort of at least 50 category-relevant prompts before writing the audit, you’re looking at a checklist, not a diagnosis. The baseline is the control group. Without it, there’s no way to tell which recommendations moved the needle and which were noise. Signal audits include a CSV of the exact prompts, the citation presence per engine, and the citation rank position where relevant. Theatre audits say “your AI visibility is low” with no supporting data.
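To make the baseline concrete, here is a rough sketch of how per-engine citation share could be computed from such a CSV. The column names (`prompt`, `engine`, `cited`, `rank`) and the sample rows are invented for illustration; a real audit's export may be laid out differently, and a real cohort would hold 50+ prompts rather than two.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows, only to show the assumed layout.
SAMPLE_CSV = """prompt,engine,cited,rank
best trail running shoes,perplexity,1,2
best trail running shoes,chatgpt,0,
waterproof hiking boots,perplexity,0,
waterproof hiking boots,chatgpt,1,4
"""


def citation_share(csv_text: str) -> dict[str, float]:
    """Return the fraction of prompts on which the site was cited, per engine."""
    cited = defaultdict(int)
    total = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        engine = row["engine"]
        total[engine] += 1
        cited[engine] += int(row["cited"])  # assumes cited is recorded as 0 or 1
    return {engine: cited[engine] / total[engine] for engine in total}
```

Running the same function on the 60-day re-measure CSV gives you the before/after pair that the baseline exists to support.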

Dimension 03 — Schema diagnosis depth

The auditor should have opened view-source on your product pages, collection pages, and homepage. They should be able to tell you whether your Product node has a valid Offer, whether MerchantReturnPolicy is present, whether FAQPage schema is on the right templates, and whether conflicting apps are emitting duplicate or contradictory blocks. Theatre audits say “add schema.” Signal audits show you the exact blocks you’re missing with line numbers and remediation copy you can paste into Liquid or metafields.
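A first pass at this check can be scripted before anyone opens view-source by hand. The sketch below uses only the Python standard library to pull JSON-LD blocks out of a page's HTML and answer the four questions above. The class and function names are our own, and the duplicate check is deliberately crude (more than one Product node on a page), so treat it as a prompt for manual inspection rather than a verdict.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect parsed <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # malformed blocks are skipped here; worth flagging in a real audit


def iter_nodes(value):
    """Yield every dict nested anywhere inside a parsed JSON-LD value."""
    if isinstance(value, dict):
        yield value
        for child in value.values():
            yield from iter_nodes(child)
    elif isinstance(value, list):
        for item in value:
            yield from iter_nodes(item)


def types_of(node):
    """Return a node's @type as a list, whether it was a string or a list."""
    node_type = node.get("@type", [])
    return [node_type] if isinstance(node_type, str) else list(node_type)


def diagnose(html: str) -> dict:
    parser = JsonLdExtractor()
    parser.feed(html)
    nodes = [n for block in parser.blocks for n in iter_nodes(block)]
    products = [n for n in nodes if "Product" in types_of(n)]
    return {
        "product_count": len(products),
        "product_has_offer": any("offers" in p for p in products),
        "has_return_policy": any("MerchantReturnPolicy" in types_of(n) for n in nodes),
        "has_faq_page": any("FAQPage" in types_of(n) for n in nodes),
        "duplicate_product_blocks": len(products) > 1,
    }
```

Feed it the saved HTML of a product page, a collection page, and the homepage, and compare the three results against what each template is supposed to emit.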

Dimension 04 — Content extractability

Retrieval systems quote content. The question the auditor should answer is: which sentences on your site are quotable under a 400-token budget? Which claims are extractable as standalone facts? Which comparisons are honest enough to survive the LLM’s unsupported-claim detection? Theatre audits recommend “increase word count” and “improve readability.” Signal audits grade your top 20 landing pages on claim density, quotability, and honest spec disclosure, with rewrite examples for the lowest performers.
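Claim density can be roughed out in code, with heavy caveats. The sketch below rests on two assumptions that are ours, not properties of any engine: a token is approximated as one whitespace-separated word, and a sentence counts as a standalone claim if it contains a figure and does not open with a pronoun that needs earlier context. A human grader should overrule it freely.

```python
import re

TOKEN_BUDGET = 400  # the budget named in the text
CONTEXT_DEPENDENT_OPENERS = {"it", "this", "that", "they", "these", "those"}


def split_sentences(text: str) -> list[str]:
    """Naive split on sentence-ending punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def approx_tokens(text: str) -> int:
    """Crude proxy: one token per whitespace-separated word (an assumption)."""
    return len(text.split())


def is_standalone_claim(sentence: str) -> bool:
    """Heuristic: contains a figure and does not lean on a prior sentence."""
    has_figure = bool(re.search(r"\d", sentence))
    first_word = sentence.split()[0].lower().strip(",")
    return has_figure and first_word not in CONTEXT_DEPENDENT_OPENERS


def grade_passage(text: str) -> dict:
    sentences = split_sentences(text)
    claims = [s for s in sentences if is_standalone_claim(s)]
    return {
        "sentences": len(sentences),
        "claims": claims,
        "claim_density": len(claims) / len(sentences) if sentences else 0.0,
        "within_budget": approx_tokens(text) <= TOKEN_BUDGET,
    }
```

On a passage such as "The Ridge 2 boot weighs 480 g per shoe. It is our most popular model. Customers love the fit.", only the first sentence counts as a claim, so one sentence in three is extractable.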

Dimension 05 — Cross-surface comparison

Your audit is only as useful as the competitive context it establishes. Signal audits compare your site against three to five direct category competitors on the same prompt cohort, on the same engines, in the same week. Theatre audits list your flaws in isolation — which is useless, because the retrieval pool is adversarial. You don’t need to be perfect; you need to be the best citation candidate in your category for the prompts that matter.
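One way to express that comparison, as a sketch: record which domains each engine cited for each prompt, then rank your site and its competitors by the share of observations in which they appear. The tuple layout, the function name, and the `.example` domains below are placeholders for illustration.

```python
from collections import Counter


def share_by_domain(observations, domains):
    """Rank domains by the fraction of (prompt, engine) observations citing them.

    observations: iterable of (prompt, engine, set_of_cited_domains)
    domains: your site plus the competitors being compared
    """
    observations = list(observations)
    if not observations:
        return [(domain, 0.0) for domain in domains]
    counts = Counter()
    for _prompt, _engine, cited in observations:
        for domain in domains:
            if domain in cited:
                counts[domain] += 1
    n = len(observations)
    return sorted(
        ((domain, counts[domain] / n) for domain in domains),
        key=lambda pair: pair[1],
        reverse=True,
    )


# Invented observations: two prompts, two engines, same week.
observations = [
    ("p1", "perplexity", {"rival-a.example", "yourstore.example"}),
    ("p1", "chatgpt", {"rival-a.example"}),
    ("p2", "perplexity", {"rival-a.example", "rival-b.example"}),
    ("p2", "chatgpt", {"yourstore.example"}),
]
ranking = share_by_domain(
    observations, ["yourstore.example", "rival-a.example", "rival-b.example"]
)
```

With these made-up rows, `rival-a.example` is cited in 3 of 4 observations, `yourstore.example` in 2 of 4, and `rival-b.example` in 1 of 4, which is the kind of relative position a flaw list in isolation cannot show.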

Dimension 06 — Remediation sequencing

A stack of 120 findings with no priority order is not an audit — it’s a bibliography. Signal audits order fixes by expected citation-share impact divided by implementation effort, and they group related fixes into sprints. Theatre audits give you a flat checklist and leave you to guess.
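The ordering rule is simple enough to sketch. In the example below, `expected_impact` and `effort_days` are estimates the auditor would have to supply, and the names are ours. Note one simplification: the sketch fills sprints in rank order, whereas the audits described above also group related fixes together, which takes judgment this code does not attempt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    name: str
    expected_impact: float  # estimated citation-share lift (auditor's estimate)
    effort_days: float      # estimated implementation effort


def sequence(findings, per_sprint=3):
    """Order findings by impact divided by effort, then chunk into sprints."""
    for finding in findings:
        if finding.effort_days <= 0:
            raise ValueError(f"{finding.name}: effort must be positive")
    ranked = sorted(
        findings,
        key=lambda f: f.expected_impact / f.effort_days,
        reverse=True,
    )
    return [ranked[i:i + per_sprint] for i in range(0, len(ranked), per_sprint)]
```

A fix estimated at 3 points of lift for 1 day of work (ratio 3.0) sorts ahead of one estimated at 5 points for 5 days (ratio 1.0), even though the second has the larger headline impact.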

The signal audit’s end state: three fixes, not thirty

The most counterintuitive property of a signal audit is that it usually ends with three to five high-leverage fixes, not thirty. Theatre audits pad to look exhaustive because the asymmetric-information dynamic rewards the appearance of thoroughness. Signal audits trust their model of retrieval enough to say, “these three fixes will move your citation share measurably in the next 60 days; the rest is noise.” When a consultant hands you a 120-item fix list, they’re telling you they don’t have a model of which items move the needle.

The audits we’ve seen produce the biggest actual lift in client citation share have this profile in common: a tightly scoped brief (often 14-22 pages, not 80), three to five prioritised fixes with implementation code, a baseline measurement CSV, and a follow-up measurement plan at 30/60/90 days to verify impact. Anything dramatically longer than that is usually hiding weak prioritisation behind volume.

How to score an audit you already have

If you’ve already paid for an audit and want to know whether to act on it: run the six-dimension scorecard yourself. Open the audit, go to each of the six dimensions above, and score 0-5 based on what the document actually contains. Total <15 → set aside and commission a signal audit; acting on the current one will burn team cycles with low return. Total 15-21 → actionable for the specific dimensions that scored 4+, ignore the rest. Total >21 → work through it in priority order and re-measure in 60 days.
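The triage above, including the middle-band rule of acting only on dimensions that scored 4 or more, can be written down as a small function. This is a sketch of the decision rule as stated, with a hypothetical function name and free-form dimension keys; it assumes the scores have already been checked to be six integers between 0 and 5.

```python
def next_step(scores: dict[str, int]) -> tuple[str, list[str]]:
    """Return the recommended action and the dimensions worth acting on."""
    total = sum(scores.values())
    if total < 15:
        return ("set aside and commission a signal audit", [])
    if total <= 21:
        strong = [name for name, score in scores.items() if score >= 4]
        return ("act only on the dimensions that scored 4+", strong)
    return ("work through in priority order and re-measure in 60 days", list(scores))
```

For example, an audit scoring 5 on schema diagnosis, 4 on remediation sequencing, and 2 on each of the other four dimensions totals 17, so only the schema and sequencing sections are worth acting on.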

Demand engine-specific diagnosis. An audit that lumps “AI search” into one bucket cannot tell you where Perplexity differs from ChatGPT, and therefore cannot tell you where to invest.
Require a baseline CSV, not just a score. The raw prompt-by-prompt citation data is the control group. Without it, you can’t measure whether the recommended fixes worked.
Reject flat checklists. If the audit lists 40+ fixes with no priority order, it’s transferring the hard work to you.
Ask for three-fix sprint plans, not 120-line bibliographies. Signal audits end in sprints, not libraries.
Lock in a re-measure clause. Any auditor confident in their recommendations will agree to a 60-day follow-up measurement. Auditors who won’t agree to re-measure are selling theatre.

Closing — audit the audit

The cost of a theatre audit isn’t the invoice — it’s the three to six months your team spends executing against the wrong recommendations while your actual citation share quietly decays. An audit without retrieval reasoning is decor. Grade it before you pay for it; re-grade it before you act on it.

Tags: audits, geo, agencies, procurement, quality
