The keyword list was the best tool we had when search was a single-turn typed query. It’s a bad fit for AI retrieval, where a single buyer session spans three to five turns, crosses engines, and resolves through corroboration across multiple sources. The replacement artefact is a prompt library — a living document with seven columns per row and a weekly maintenance cycle. This post covers the schema, the lifecycle, and the common failure modes in the transition.
Why keyword lists fail the AI retrieval test
A keyword list carries four columns: keyword, volume, difficulty, cost-per-click. That schema is sufficient when your goal is to rank one URL for one query on one engine. None of those assumptions hold in AI retrieval. The “query” is a multi-turn conversation. The “URL” is one of several sources the engine synthesises. The “engine” is plural — Perplexity, AI Mode, ChatGPT, Claude all behave differently. And the “rank” is a probability of appearing in a citation shortlist, not a visible position 1-10.
The second problem is more insidious: keyword lists implicitly encode that volume is the right prioritisation signal. In AI retrieval, volume is noisy. A 300-volume comparison prompt with 82% purchase intent outperforms a 12,000-volume informational prompt with 5% intent by roughly 40x in attributable pipeline per citation. Volume-driven prioritisation misses where the money actually flows.

The seven columns of a prompt library
Our working prompt library template has seven columns because each one drives a different downstream decision. Drop any column and you lose optionality on something meaningful.
Prompt. The natural-language question exactly as a buyer would phrase it. Not keyword-ified. Full sentences with the buyer’s context (“under $800”, “for kids’ room”, “with two cats and a toddler”). These details matter because engines index the full surface.
Intent. Research / comparison / high-intent shortlist / post-purchase / objection. Intent decides what kind of page wins the citation. Research prompts win on long-form guides; shortlist prompts win on collection pages with AggregateRating; objection prompts win on FAQPage content.
Buyer stage. Where in the funnel the buyer is when this prompt fires. Awareness, consideration, comparison, decision, post-purchase. A single prompt can fire at multiple stages, but most have a dominant one.
Engine. Which engine(s) this prompt is most likely to fire on. Commerce prompts cluster on Perplexity Shopping and AI Mode; research prompts cluster on ChatGPT and Claude. The engine column tells you where to measure first.
Turn position. Turn 1 / turn 2 / turn 3+. Turn 1 opens the session (“best wool rug for X”). Turn 2 narrows (“between Y and Z, which is better for…”). Turn 3 handles objections (“is it safe for…”). Different page types win different turns.
Corroboration needs. Where else must the answer appear for the engine to trust it? Reddit thread, buying guide, review platform, expert publication. This column drives off-site content work and PR.
Citation target page. The specific URL on your site that should win the citation for this prompt. Usually a PDP, collection, FAQ, or long-form guide. Making this explicit enables measurement: you can check whether the engine cited the intended page or a different one, and adjust.
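One way to keep the seven columns honest is to encode them as a typed record rather than free-form spreadsheet cells. The sketch below is illustrative only: the class and field names, the URL check, and the example.com URL are assumptions, not part of any particular tool.

```python
from dataclasses import dataclass
from enum import Enum


class Intent(Enum):
    RESEARCH = "research"
    COMPARISON = "comparison"
    SHORTLIST = "high-intent shortlist"
    POST_PURCHASE = "post-purchase"
    OBJECTION = "objection"


class BuyerStage(Enum):
    AWARENESS = "awareness"
    CONSIDERATION = "consideration"
    COMPARISON = "comparison"
    DECISION = "decision"
    POST_PURCHASE = "post-purchase"


@dataclass(frozen=True)
class PromptRow:
    prompt: str                           # full sentence in the buyer's own words
    intent: Intent
    buyer_stage: BuyerStage               # the dominant stage only
    engines: tuple[str, ...]              # where to measure first
    turn_position: int                    # 1, 2, or 3 (3 stands for "3+")
    corroboration_needs: tuple[str, ...]  # off-site sources the engine should also find
    citation_target_page: str             # one specific URL, not "any of our PDPs"

    def __post_init__(self):
        if self.turn_position not in (1, 2, 3):
            raise ValueError("turn_position must be 1, 2, or 3 (3 = 3+)")
        # Assumed check: a citation target has to be a concrete URL.
        if not self.citation_target_page.startswith(("http://", "https://")):
            raise ValueError("citation_target_page must be a specific URL")


# Hypothetical row; the URL is a placeholder.
row = PromptRow(
    prompt="best wool rug 8x10 for high-traffic hallway with two cats",
    intent=Intent.SHORTLIST,
    buyer_stage=BuyerStage.DECISION,
    engines=("Perplexity", "AI Mode"),
    turn_position=1,
    corroboration_needs=("Reddit thread", "buying guide"),
    citation_target_page="https://example.com/collections/wool-rugs-8x10",
)
```

Making every field required means a row cannot be saved with a column left blank, which is the point of the schema.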
The four-stage weekly lifecycle
A prompt library isn’t a one-off artefact — it’s a living document maintained on a weekly cadence. Treated as a one-off, it becomes a keyword list wearing a better schema. Treated as a living cycle, it becomes the operational nerve centre of your GEO programme.

Monday — Discover
Surface new candidate prompts from the places your buyers ask real questions. Sales call transcripts are the richest source: the questions prospects ask in discovery calls map directly to turn-1 and turn-2 prompts they’d put to an AI. Support tickets surface objection prompts (turn-3). Subreddits in your category surface the comparison prompts your buyers are actually posting. The discipline: add 5-12 new prompts per week; fill all seven columns on each; don’t add “obvious” prompts that don’t match a real buyer voice.
Tuesday — Sample
Run every prompt in the library across the four primary engines in a consistent order, capturing presence (cited or not) and rank (which citation slot if cited). Tooling ranges from manual (a VA with a script and a spreadsheet) to fully automated (Surfient runs this continuously and surfaces the deltas). Either way, the output is a fresh presence matrix.
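As a rough sketch of what the sampling step produces, the presence matrix can be a mapping from (prompt, engine) to presence and rank. The engine list below assumes the four engines named earlier in this post, and `fake_sampler` is a hypothetical stand-in for whatever manual check or tool actually queries each engine.

```python
from typing import Callable, Optional

# Assumed engine set, kept in a fixed order so weekly runs are comparable.
ENGINES = ("Perplexity", "AI Mode", "ChatGPT", "Claude")

# A sampler returns the citation slot (1 = first) or None when not cited.
Sampler = Callable[[str, str], Optional[int]]


def build_presence_matrix(prompts, sample: Sampler):
    """Run every prompt on every engine and record presence and rank."""
    matrix = {}
    for prompt in prompts:
        for engine in ENGINES:
            slot = sample(prompt, engine)
            matrix[(prompt, engine)] = {"cited": slot is not None, "rank": slot}
    return matrix


def fake_sampler(prompt, engine):
    # Hypothetical canned results standing in for a real engine query.
    canned = {("best wool rug for a hallway with two cats", "Perplexity"): 2}
    return canned.get((prompt, engine))


p = "best wool rug for a hallway with two cats"
matrix = build_presence_matrix([p], fake_sampler)
```

Keeping the sampler behind a plain callable lets a manual process and an automated one fill the same matrix.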
Wed-Thu — Measure
Aggregate into citation share numbers at three granularities: overall, per-engine, per-intent-stage. Diff against last week and flag anything shifting by 5 points or more. Compare against your top 3 category competitors on the same cohort. Summarise findings in a dashboard update that goes to the GEO lead, the content lead, and the head of merchandising.
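The aggregation and the 5-point flag can be sketched in a few lines. This assumes citation share means the percentage of sampled (prompt, engine) pairs where you were cited; the row layout, function names, and sample numbers are illustrative, not real data.

```python
from collections import defaultdict


def citation_share(rows, key=None):
    """Percent of sampled rows that were cited, grouped by `key` (or overall)."""
    cited = defaultdict(int)
    total = defaultdict(int)
    for r in rows:
        group = r[key] if key else "overall"
        total[group] += 1
        if r["cited"]:
            cited[group] += 1
    return {g: 100.0 * cited[g] / total[g] for g in total}


def flag_shifts(this_week, last_week, threshold=5.0):
    """Return groups whose share moved by `threshold` points or more."""
    flags = {}
    for group, share in this_week.items():
        if group in last_week:
            delta = share - last_week[group]
            if abs(delta) >= threshold:
                flags[group] = round(delta, 1)
    return flags


# Made-up sample rows for illustration.
rows = [
    {"engine": "Perplexity", "intent": "comparison", "cited": True},
    {"engine": "Perplexity", "intent": "research", "cited": False},
    {"engine": "ChatGPT", "intent": "comparison", "cited": True},
    {"engine": "ChatGPT", "intent": "research", "cited": True},
]

overall = citation_share(rows)                 # {'overall': 75.0}
per_engine = citation_share(rows, "engine")    # Perplexity 50.0, ChatGPT 100.0
per_intent = citation_share(rows, "intent")    # comparison 100.0, research 50.0

last_week_per_engine = {"Perplexity": 50.0, "ChatGPT": 60.0}
flags = flag_shifts(per_engine, last_week_per_engine)  # {'ChatGPT': 40.0}
```

The same two functions can be run on a competitor's rows to produce the comparison on the same cohort.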
Friday — Respond
Turn findings into sprint tickets. Prompts with strong intent but weak citation share get new content work. Pages that are winning citations but only on high-cost engines get prioritised for schema work. Prompts that have lost citation share get corroboration tickets (new Reddit engagement, guide placements, PR outreach). Prompts that have been dead for 30+ days get retired to archive. The output is a clean, prioritised queue for content and dev to pick up next week.
Common failure modes in the transition
Failure mode #1: keyword list with renamed columns. Team renames “keyword” to “prompt” and “volume” to “priority score” but populates the new columns with the same logic. Result: same behaviour, new spreadsheet. Cure: enforce the schema discipline that prompts must be full sentences with buyer context, not keyword phrases.
Failure mode #2: skipping the Respond stage. Teams measure diligently but never convert findings into content or dev sprint work. The prompt library becomes a dashboard rather than an operational artefact. Cure: treat the Friday Respond session as the most important meeting of the week; if nothing changes in the next sprint, the library isn’t working.
Failure mode #3: library bloat. Every prompt ever discovered stays forever, even when citation share is persistently zero and nothing has changed in the market. Library grows to 900+ rows and the weekly cadence becomes unsustainable. Cure: retire prompts aggressively; archive anything at zero citation share for 30+ days with no active content fix scheduled.
Failure mode #4: single-engine sampling. Team tracks Perplexity only because it’s the easiest to automate. Misses shifts on AI Mode or ChatGPT that would change priority. Cure: all four primary engines every week, even if the tooling is manual on two of them initially.
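The retirement rule from failure mode #3 is mechanical enough to script. A minimal sketch, assuming each row tracks the date its citation share first hit zero and whether a content fix is scheduled; the field names and sample rows are hypothetical.

```python
from datetime import date, timedelta


def prompts_to_archive(rows, today, max_dead_days=30):
    """Prompts at zero citation share for 30+ days with no fix scheduled."""
    cutoff = today - timedelta(days=max_dead_days)
    return [
        r["prompt"]
        for r in rows
        if r["citation_share"] == 0
        and r["zero_since"] is not None
        and r["zero_since"] <= cutoff
        and not r["fix_scheduled"]
    ]


# Made-up rows for illustration.
rows = [
    {"prompt": "A", "citation_share": 0, "zero_since": date(2025, 5, 1), "fix_scheduled": False},
    {"prompt": "B", "citation_share": 0, "zero_since": date(2025, 6, 20), "fix_scheduled": False},
    {"prompt": "C", "citation_share": 0, "zero_since": date(2025, 5, 1), "fix_scheduled": True},
    {"prompt": "D", "citation_share": 12, "zero_since": None, "fix_scheduled": False},
]

to_archive = prompts_to_archive(rows, today=date(2025, 6, 30))  # ['A']
```

Running this as part of the Friday session keeps retirement a default action rather than a debate.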
- Kill the volume column. It’s an attractor for old habits. Replace with intent + buyer stage. If you miss a traffic metric, add estimated-monthly-traffic to the citation-target page, not the prompt.
- Every prompt needs a citation-target page. “Any of our PDPs” isn’t a citation target. Pick the specific URL you want the engine to cite, and measure against it.
- Full sentences, not phrases. “best wool rug 8x10 for high-traffic hallway with two cats” is a prompt. “wool rug 8x10 pets” is a keyword phrase. They behave totally differently in AI retrieval.
- Include buyer voice prompts from calls and support. Not just your marketing team’s imagined prompts. Customer language reliably outperforms internal guesses.
- Treat the Friday Respond session as the point of the cycle. If nothing changes in sprint planning, the library is decorative and should be killed rather than maintained.
Closing — the artefact shapes the work
The deepest cost of keeping a keyword list after moving to GEO isn’t the list itself — it’s that the artefact shapes the conversations your team has. Weekly meetings organised around keyword volume debate whether to target a 12K-volume keyword or a 3K-volume keyword. Weekly meetings organised around a prompt library debate whether to add a FAQ to a collection page or to seed a Reddit thread with buyer language. Different artefact, different strategy surface, different outcomes.