
The prompt library that replaces your keyword list

Keyword lists are four columns of metadata optimised for single-turn typed SERPs. AI retrieval is multi-turn, multi-engine, and corroboration-weighted — the replacement artefact is a 7-column prompt library with a weekly discovery/sampling/measurement/response cycle. Here’s the schema, the lifecycle, and the failure modes to watch.

Harry Parker
Co-founder, Onviqa Inc. · Surfient
TL;DR
  • A keyword list carries 4 columns (keyword, volume, difficulty, CPC) optimised for single-turn typed SERPs. None of those columns tells you what to do in a multi-turn AI retrieval world.
  • The replacement is a 7-column prompt library (prompt, intent, buyer stage, engine, turn position, corroboration needs, citation target page) maintained on a weekly discover-sample-measure-respond cycle — ~6 hours of team time at 186-prompt scale.
  • The conversion ratio when migrating is ~1 keyword → 0.7 prompts after grading, so your first library will be smaller than the list you’re replacing — but every row will carry 10x the decision-making value.

The keyword list was the best tool we had when search was a single-turn typed query. It’s a bad fit for AI retrieval, where a single buyer session spans three to five turns, crosses engines, and resolves through corroboration across multiple sources. The replacement artefact is a prompt library — a living document with seven columns per row and a weekly maintenance cycle. This post covers the schema, the lifecycle, and the common failure modes in the transition.

Why keyword lists fail the AI retrieval test

A keyword list carries four columns: keyword, volume, difficulty, cost-per-click. That schema is sufficient when your goal is to rank one URL for one query on one engine. None of those assumptions hold in AI retrieval. The “query” is a multi-turn conversation. The “URL” is one of several sources the engine synthesises. The “engine” is plural: Perplexity, Google AI Mode, ChatGPT, and Claude all behave differently. And the “rank” is a probability of appearing in a citation shortlist, not a visible position from 1 to 10.

The second problem is more insidious: keyword lists implicitly encode that volume is the right prioritisation signal. In AI retrieval, volume is noisy. A 300-volume comparison prompt with 82% purchase intent outperforms a 12,000-volume informational prompt with 5% intent by roughly 40x in attributable pipeline per citation. Volume-driven prioritisation misses where the money actually flows.

Two-column comparison. Left column red: traditional keyword list with four columns (keyword, monthly volume, difficulty, CPC). Below, six reasons it fails for AI retrieval including no intent signal, no turn position, no buyer stage, no engine specificity, no corroboration map, no citation target. Right column cyan: prompt library with seven columns (prompt, intent, buyer stage, engine, turn position, corroboration needs, citation target page). Below, five reasons it works for AI retrieval.
The keyword list schema (left, 4 columns, volume-driven) versus the prompt library schema (right, 7 columns, intent + turn + stage driven). The right side carries the context AI retrieval actually uses.

The seven columns of a prompt library

Our working prompt library template has seven columns because each one drives a different downstream decision. Drop any column and you lose optionality on something meaningful.

Prompt. The natural-language question exactly as a buyer would phrase it. Not keyword-ified. Full sentences with the buyer’s context (“under $800”, “for kids’ room”, “with two cats and a toddler”). These details matter because engines index the full surface.

Intent. Research / comparison / high-intent shortlist / post-purchase / objection. Intent decides what kind of page wins the citation. Research prompts win on long-form guides; shortlist prompts win on collection pages with AggregateRating; objection prompts win on FAQPage content.

Buyer stage. Where in the funnel the buyer is when this prompt fires. Awareness, consideration, comparison, decision, post-purchase. A single prompt can fire at multiple stages, but most have a dominant one.

Engine. Which engine(s) this prompt is most likely to fire on. Commerce prompts cluster on Perplexity Shopping and Google AI Mode; research prompts cluster on ChatGPT and Claude. The engine column tells you where to measure first.

Turn position. Turn 1 / turn 2 / turn 3+. Turn 1 opens the session (“best wool rug for X”). Turn 2 narrows (“between Y and Z, which is better for…”). Turn 3 handles objections (“is it safe for…”). Different page types win different turns.

Corroboration needs. Where else must the answer appear for the engine to trust it? Reddit thread, buying guide, review platform, expert publication. This column drives off-site content work and PR.

Citation target page. The specific URL on your site that should win the citation for this prompt. Usually a PDP, collection, FAQ, or long-form guide. Making this explicit enables measurement: you can check whether the engine cited the intended page or a different one, and adjust.
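To make the schema concrete, here is a minimal sketch of one library row as a Python dataclass. The class names, field names, and the example URL are illustrative assumptions, not part of the Surfient template; only the seven columns and their allowed values come from the descriptions above.

```python
from dataclasses import dataclass, fields
from enum import Enum


class Intent(Enum):
    RESEARCH = "research"
    COMPARISON = "comparison"
    SHORTLIST = "high-intent shortlist"
    POST_PURCHASE = "post-purchase"
    OBJECTION = "objection"


class BuyerStage(Enum):
    AWARENESS = "awareness"
    CONSIDERATION = "consideration"
    COMPARISON = "comparison"
    DECISION = "decision"
    POST_PURCHASE = "post-purchase"


@dataclass
class PromptRow:
    """One row of the prompt library: seven columns, one field each."""
    prompt: str                     # full sentence in the buyer's own words
    intent: Intent
    buyer_stage: BuyerStage         # the dominant stage for this prompt
    engines: list[str]              # where the prompt is most likely to fire
    turn_position: int              # 1, 2, or 3 (3 stands for "turn 3+")
    corroboration_needs: list[str]  # off-site sources the engine should see
    citation_target_page: str       # the one URL that should win the citation


# Hypothetical example row; the URL is a placeholder.
row = PromptRow(
    prompt="What is the best wool rug under $800 for a kids' room with two cats and a toddler?",
    intent=Intent.SHORTLIST,
    buyer_stage=BuyerStage.DECISION,
    engines=["Perplexity", "Google AI Mode"],
    turn_position=1,
    corroboration_needs=["Reddit thread", "buying guide"],
    citation_target_page="https://example.com/collections/wool-rugs",
)

print([f.name for f in fields(PromptRow)])
```

Using enums for intent and buyer stage keeps those two columns to a fixed vocabulary, which makes the later segmentation by stage and intent straightforward.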

The four-stage weekly lifecycle

A prompt library isn’t a one-off artefact — it’s a living document maintained on a weekly cadence. Treated as a one-off, it becomes a keyword list wearing a better schema. Treated as a living cycle, it becomes the operational nerve centre of your GEO programme.

Four-stage weekly prompt library lifecycle. Stage 01 Discover on Monday AM for 1 hour, sources include sales call transcripts, support tickets, Reddit, customer interviews; output 5-12 new prompt rows. Stage 02 Sample on Tuesday for 2 hours, runs prompts across Perplexity, Google AI Mode, ChatGPT, Claude; output presence and rank per prompt per engine. Stage 03 Measure on Wednesday-Thursday for 2 hours, computes citation share, week-over-week deltas, competitor comparison, stage and turn segmentation; output dashboard update and anomaly alerts. Stage 04 Respond on Friday for 1 hour, prioritises new citation-target pages, schema upgrades, corroboration work, prompt retirements; output next-sprint queue.
The weekly prompt-library lifecycle. Monday: Discover from sales, support, Reddit, interviews. Tuesday: Sample across engines. Wednesday-Thursday: Measure citation share, diffs, competitor comparison. Friday: Respond with next-sprint priorities for content, schema, and off-site work. ~6 hours of team time per week at 186-prompt scale.

Monday — Discover

Surface new candidate prompts from the places your buyers ask real questions. Sales call transcripts are the richest source: the questions prospects ask in discovery calls map directly to turn-1 and turn-2 prompts they’d put to an AI. Support tickets surface objection prompts (turn-3). Subreddits in your category surface the comparison prompts your buyers are actually posting. The discipline: add 5-12 new prompts per week, fill all seven columns on each, and don’t add “obvious” prompts that don’t match a real buyer voice.

Tuesday — Sample

Run every prompt in the library across the four primary engines in a consistent order, capturing presence (cited or not) and rank (which citation slot if cited). Tooling ranges from manual (a VA with a script and a spreadsheet) to fully automated (Surfient runs this continuously and surfaces the deltas). Either way, the output is a fresh presence matrix.
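As a rough sketch of what the presence matrix could look like in code, the snippet below records a citation slot (or `None` for "not cited") per prompt per engine. The `fetch_citations` callable is a stand-in assumption for however you actually collect an engine's cited URLs (manual copy-paste, a script, or a tool); no real engine API is implied, and `fake_fetch` returns canned data purely for illustration.

```python
from typing import Callable, Optional

ENGINES = ["Perplexity", "Google AI Mode", "ChatGPT", "Claude"]


def citation_rank(cited_urls: list[str], target_url: str) -> Optional[int]:
    """Return the 1-based citation slot of target_url, or None if it is absent."""
    for slot, url in enumerate(cited_urls, start=1):
        # Ignore trailing slashes so equivalent URLs compare equal.
        if url.rstrip("/") == target_url.rstrip("/"):
            return slot
    return None


def sample_library(
    library: dict[str, str],
    fetch_citations: Callable[[str, str], list[str]],
) -> dict[tuple[str, str], Optional[int]]:
    """Run every prompt on every engine, always in the same engine order.

    library maps prompt text to its citation target page.
    The result maps (prompt, engine) to a rank, or None when not cited.
    """
    matrix = {}
    for prompt, target in library.items():
        for engine in ENGINES:
            cited = fetch_citations(engine, prompt)
            matrix[(prompt, engine)] = citation_rank(cited, target)
    return matrix


def fake_fetch(engine: str, prompt: str) -> list[str]:
    """Hypothetical stand-in for a real sampling step; returns canned citations."""
    canned = {
        "Perplexity": [
            "https://competitor.example/rugs",
            "https://example.com/collections/wool-rugs",
        ],
        "ChatGPT": ["https://example.com/collections/wool-rugs/"],
    }
    return canned.get(engine, [])


library = {
    "best wool rug 8x10 for high-traffic hallway with two cats":
        "https://example.com/collections/wool-rugs",
}
matrix = sample_library(library, fake_fetch)
```

Presence is then simply `rank is not None`, so one structure carries both of the Tuesday outputs.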

Wed-Thu — Measure

Aggregate into citation share numbers at three granularities: overall, per-engine, per-intent-stage. Diff against last week and flag anything shifting by 5 percentage points or more. Compare against your top 3 category competitors on the same cohort. Summarise findings in a dashboard update that goes to the GEO lead, content lead, and head of merchandising.
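The aggregation and week-over-week diff can be sketched in a few lines. This assumes citation share means "percentage of samples in which you were cited" and that each sample is a dict with `engine`, `stage`, and `cited` keys; both the definition and the record shape are assumptions for illustration, and the sample data is invented.

```python
from collections import defaultdict
from typing import Optional


def citation_share(samples: list[dict], group_by: Optional[str] = None) -> dict[str, float]:
    """Percentage of samples with a citation, overall or per group."""
    totals, hits = defaultdict(int), defaultdict(int)
    for sample in samples:
        group = sample[group_by] if group_by else "overall"
        totals[group] += 1
        hits[group] += 1 if sample["cited"] else 0
    return {group: 100.0 * hits[group] / totals[group] for group in totals}


def flag_shifts(this_week: dict[str, float], last_week: dict[str, float],
                threshold: float = 5.0) -> dict[str, float]:
    """Return groups whose share moved by at least `threshold` points."""
    flagged = {}
    for group, share in this_week.items():
        if group in last_week:
            delta = share - last_week[group]
            if abs(delta) >= threshold:
                flagged[group] = round(delta, 1)
    return flagged


# Invented sample data: two prompts on two engines, two weeks running.
last_week = [
    {"engine": "Perplexity", "stage": "decision", "cited": True},
    {"engine": "Perplexity", "stage": "comparison", "cited": False},
    {"engine": "ChatGPT", "stage": "decision", "cited": False},
    {"engine": "ChatGPT", "stage": "comparison", "cited": True},
]
this_week = [
    {"engine": "Perplexity", "stage": "decision", "cited": True},
    {"engine": "Perplexity", "stage": "comparison", "cited": True},
    {"engine": "ChatGPT", "stage": "decision", "cited": False},
    {"engine": "ChatGPT", "stage": "comparison", "cited": True},
]

overall = citation_share(this_week)
engine_shifts = flag_shifts(
    citation_share(this_week, "engine"),
    citation_share(last_week, "engine"),
)
```

Running the same two functions on a competitor's samples gives the comparison numbers, as long as the prompt cohort is identical.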

Friday — Respond

Turn findings into sprint tickets. Prompts with strong intent but weak citation share get new content work. Pages that are winning citations but only on high-cost engines get prioritised for schema work. Prompts that have lost citation share get corroboration tickets (new Reddit engagement, guide placements, PR outreach). Prompts that have been dead for 30+ days get retired to archive. The output is a clean, prioritised queue for content and dev to pick up next week.
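The triage rules above can be written down as a small decision function. This is a sketch under stated assumptions: the 20% "weak share" cutoff, the reuse of the 5-point shift as the "lost share" trigger, the order in which rules are checked, and the `only_high_cost_engines` flag (the post does not define which engines count as high-cost) are all illustrative choices, not Surfient's rules.

```python
from dataclasses import dataclass

WEAK_SHARE = 20.0      # assumption: below this percentage, share counts as weak
LOSS_THRESHOLD = -5.0  # assumption: reuses the 5-point flag from the Measure stage
DEAD_AFTER_DAYS = 30   # from the post: dead for 30+ days means retire


@dataclass
class PromptStatus:
    prompt: str
    high_intent: bool
    citation_share: float         # percent, this week
    share_delta: float            # percentage points versus last week
    only_high_cost_engines: bool  # you decide which engines count as high-cost
    days_at_zero: int
    fix_scheduled: bool = False


def triage(status: PromptStatus) -> str:
    """Map one prompt's weekly numbers to a ticket type for the sprint queue."""
    if status.days_at_zero >= DEAD_AFTER_DAYS and not status.fix_scheduled:
        return "retire"
    if status.high_intent and status.citation_share < WEAK_SHARE:
        return "content"
    if status.share_delta <= LOSS_THRESHOLD:
        return "corroboration"
    if status.citation_share > 0 and status.only_high_cost_engines:
        return "schema"
    return "no-action"


queue = [
    triage(PromptStatus("prompt A", True, 5.0, 0.0, False, 0)),
    triage(PromptStatus("prompt B", False, 0.0, 0.0, False, 45)),
    triage(PromptStatus("prompt C", False, 40.0, -8.0, False, 0)),
    triage(PromptStatus("prompt D", False, 30.0, 0.0, True, 0)),
]
```

Checking retirement first keeps dead prompts from generating content tickets; if your team prefers to rescue high-intent prompts before archiving them, swap the first two rules.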

Common failure modes in the transition

Failure mode #1: keyword list with renamed columns. Team renames “keyword” to “prompt” and “volume” to “priority score” but populates the new columns with the same logic. Result: same behaviour, new spreadsheet. Cure: enforce the schema discipline that prompts must be full sentences with buyer context, not keyword phrases.

Failure mode #2: skipping the Respond stage. Teams measure diligently but never convert findings into content or dev sprint work. The prompt library becomes a dashboard rather than an operational artefact. Cure: treat the Friday Respond session as the most important meeting of the week; if nothing changes in the next sprint, the library isn’t working.

Failure mode #3: library bloat. Every prompt ever discovered stays forever, even when citation share is persistently zero and nothing has changed in the market. Library grows to 900+ rows and the weekly cadence becomes unsustainable. Cure: retire prompts aggressively; archive anything at zero citation share for 30+ days with no active content fix scheduled.

Failure mode #4: single-engine sampling. The team tracks Perplexity only because it’s the easiest to automate, and misses shifts on Google AI Mode or ChatGPT that would change priority. Cure: sample all four primary engines every week, even if the tooling is manual on two of them initially.

  • Kill the volume column. It’s an attractor for old habits. Replace with intent + buyer stage. If you miss a traffic metric, add estimated-monthly-traffic to the citation-target page, not the prompt.
  • Every prompt needs a citation-target page. “Any of our PDPs” isn’t a citation target. Pick the specific URL you want the engine to cite, and measure against it.
  • Full sentences, not phrases. “best wool rug 8x10 for high-traffic hallway with two cats” is a prompt. “wool rug 8x10 pets” is a keyword phrase. They behave totally differently in AI retrieval.
  • Include buyer voice prompts from calls and support. Not just your marketing team’s imagined prompts. Customer language reliably outperforms internal guesses.
  • Treat the Friday Respond session as the point of the cycle. If nothing changes in sprint planning, the library is decorative and should be killed rather than maintained.
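Two of the checks above (full sentences rather than phrases, and a specific citation-target URL) are mechanical enough to lint automatically. The sketch below is a crude heuristic, not a real sentence detector: the six-word minimum is an assumption chosen so that the two rug examples above land on the right side of the line, and the example URL is a placeholder.

```python
from urllib.parse import urlparse

MIN_PROMPT_WORDS = 6  # assumption: shorter prompts are treated as keyword phrases


def lint_row(prompt: str, citation_target: str) -> list[str]:
    """Return a list of problems with one library row (empty list means OK)."""
    problems = []
    if len(prompt.split()) < MIN_PROMPT_WORDS:
        problems.append("prompt reads like a keyword phrase")
    parsed = urlparse(citation_target)
    # "Any of our PDPs" has no scheme or host, so it fails this check.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("citation target is not a specific URL")
    return problems


good = lint_row(
    "best wool rug 8x10 for high-traffic hallway with two cats",
    "https://example.com/products/wool-rug-8x10",
)
bad = lint_row("wool rug 8x10 pets", "Any of our PDPs")
```

A lint like this only catches the obvious cases; whether a prompt genuinely matches buyer voice still needs a human review during the Monday Discover session.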

Closing — the artefact shapes the work

The deepest cost of keeping a keyword list after moving to GEO isn’t the list itself — it’s that the artefact shapes the conversations your team has. Weekly meetings organised around keyword volume debate whether to target a 12K-volume keyword or a 3K-volume keyword. Weekly meetings organised around a prompt library debate whether to add a FAQ to a collection page or to seed a Reddit thread with buyer language. Different artefact, different strategy surface, different outcomes.

Tags: prompts, keywords, geo, process, measurement


Sources & further reading

  1. Surfient prompt-library template v2.1: schema + sample fills. Surfient Research, 2026-02-12.

Harry has led SEO and e-commerce engineering for over 12 years and has been shipping Shopify software since Onviqa was founded in 2014. He writes about where commerce is headed when shoppers stop typing queries and start asking assistants.
