4.3 Prompt taxonomy, Fragility and Volatility

The quality of your GEO measurement stands or falls with your prompt set. A poor prompt set produces unreliable numbers that don’t form a basis for strategic decisions. This page describes how to categorise prompts — and introduces two important stability metrics: Fragility and Volatility.

The four prompt categories

Direct brand questions

Prompts where your brand name appears explicitly. Examples: “How do you rate X?”, “Is Y reliable?”. These measure the baseline knowledge and sentiment an AI model has about your brand.

Category questions

Prompts about your product category without brand mention. Examples: “Best mortgage for first-time buyers”, “Which supplier of first-aid materials for Belgian businesses?”, “Which CRM is most suitable for consultants?”. This is where you measure your real position in the orientation phase, when users are still exploring options without a brand in mind.

Problem-driven questions

Prompts that start from a user’s situation rather than from a product. Examples: “I’m buying my first home, how do I start?”, “My invoice hasn’t been paid, what now?”. These are the kinds of questions users actually ask.

Comparative questions

Prompts that explicitly ask for multiple players side by side. Example: “Compare X and Y for car insurance”. Important for decision-stage content.

Prompt design principles

  • Write prompts the way a user would phrase them, not the way a marketer would
  • Vary phrasing of the same intent to test robustness
  • Cover all relevant intent types and user situations
  • Document the prompt set and keep it consistent across measurements
  • Update the prompt set periodically based on evolving search behaviour
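
One way to keep a prompt set documented and consistent is to store each prompt as a small structured record. The sketch below is only an illustration: the field name `intent_id`, the helper functions and the second CRM phrasing are assumptions made for the example, not part of any existing tool.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """The four prompt categories described on this page."""
    DIRECT_BRAND = "direct brand"
    CATEGORY = "category"
    PROBLEM_DRIVEN = "problem-driven"
    COMPARATIVE = "comparative"


@dataclass(frozen=True)
class Prompt:
    intent_id: str       # hypothetical key that groups variants of one intent
    category: Category
    text: str


def variants_by_intent(prompts):
    """Group prompt texts by intent so phrasing variants sit together."""
    grouped = defaultdict(list)
    for p in prompts:
        grouped[p.intent_id].append(p.text)
    return dict(grouped)


def missing_categories(prompts):
    """Return the categories that no prompt in the set covers yet."""
    covered = {p.category for p in prompts}
    return [c for c in Category if c not in covered]


PROMPT_SET = [
    Prompt("crm-consultants", Category.CATEGORY,
           "Which CRM is most suitable for consultants?"),
    Prompt("crm-consultants", Category.CATEGORY,
           "Best CRM for a small consulting firm"),  # illustrative variant
    Prompt("unpaid-invoice", Category.PROBLEM_DRIVEN,
           "My invoice hasn't been paid, what now?"),
]

print([c.value for c in missing_categories(PROMPT_SET)])
```

Keeping the set in a form like this makes gaps visible (here, direct brand and comparative prompts are still missing) and makes it easier to rerun exactly the same prompts in the next measurement.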

Stability measures

Fragility Index

The Fragility Index measures how vulnerable your visibility is to small changes in prompt phrasing. If “which installer of automatic kitchen fire suppression systems in Flanders?” mentions you, but “providers of kitchen fire suppression systems for restaurants in Belgium” does not, your presence is fragile.

High Fragility means your visibility depends on chance phrasings. Low Fragility means you appear consistently across variants of the same question — a sign of strong underlying entity presence.

Measure Fragility by formulating each core question in at least three variants. If you appear in all three: robust. If in one out of three: fragile.
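
The three-variant check above can be turned into a number. This page does not fix a formula, so the sketch below assumes one: the share of phrasing variants in which the brand does not appear. The substring match, the function names and the brand names (Acme, Borealis, Cirrus) are all hypothetical.

```python
def brand_mentioned(answer: str, brand: str) -> bool:
    """Naive check (an assumption): case-insensitive substring match."""
    return brand.lower() in answer.lower()


def fragility_index(variant_answers: list[str], brand: str) -> float:
    """Share of phrasing variants in which the brand does NOT appear.

    0.0 means the brand appears in every variant (robust);
    1.0 means it appears in none.
    """
    if len(variant_answers) < 3:
        raise ValueError("use at least three variants per core question")
    hits = sum(brand_mentioned(a, brand) for a in variant_answers)
    return 1 - hits / len(variant_answers)


# One AI answer per phrasing variant of the same core question.
answers = [
    "Acme and Borealis both install these systems.",
    "Borealis is a common choice.",
    "Consider Borealis or Cirrus.",
]
print(round(fragility_index(answers, "Acme"), 2))      # 0.67: one of three
print(round(fragility_index(answers, "Borealis"), 2))  # 0.0: all three
```

A plain substring match will miss misspellings and abbreviations of a brand name, so a real measurement would likely need a more careful matching step.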

Volatility Score

The Volatility Score measures how stable your position is across repeated measurements of identical prompts. AI models are not deterministic — the same question, asked at different moments, can yield different answers.

Run identical prompts multiple times (minimum three, preferably five to ten) and observe how often you are mentioned. Five out of five: stable. One out of five: volatile. Fragility and Volatility are both useful metrics, but they measure fundamentally different things.
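
The repeated-run check can be sketched in the same way. Again, no formula is prescribed here; the example assumes the mention rate p over the runs as the basic number and, purely as an illustration, uses 4·p·(1 − p) as a 0-to-1 score that is lowest when the outcome is always the same and highest when it is a coin flip.

```python
def mention_rate(mentions: list[bool]) -> float:
    """Fraction of identical-prompt runs in which the brand was mentioned."""
    if len(mentions) < 3:
        raise ValueError("run each prompt at least three times")
    return sum(mentions) / len(mentions)


def volatility_score(mentions: list[bool]) -> float:
    """Assumed score: 4 * p * (1 - p), where p is the mention rate.

    0.0 when every run gives the same outcome, 1.0 when p is 0.5.
    """
    p = mention_rate(mentions)
    return 4 * p * (1 - p)


five_of_five = [True, True, True, True, True]
one_of_five = [True, False, False, False, False]

print(round(volatility_score(five_of_five), 2))  # 0.0
print(round(volatility_score(one_of_five), 2))   # 0.64
```

Under this assumed formula, zero mentions in five runs also scores 0.0: you are stably absent. The score therefore only makes sense when read next to the mention rate itself.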

High volatility is not automatically a problem. It can also signal an open market where no brand is yet dominantly anchored in the training data — an entry opportunity, not a defensive issue. Always interpret volatility in combination with your average presence and the concentration of competitors. Low volatility with high presence of one competitor is an established market that you must attack. High volatility with diffuse presence is a market in motion where you can structurally win.

On core prompts, however, high volatility is often a red flag: your visibility depends more on random variation than on structural presence. Address it.
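
To tell an open market apart from an established one, it helps to look at how often each competitor appears across the same repeated runs. The sketch below is a rough illustration under stated assumptions: the 0.8 cut-off for a "dominant" competitor, the function names and the brand labels are all hypothetical.

```python
from collections import Counter


def presence_rates(runs: list[set[str]]) -> dict[str, float]:
    """Fraction of runs in which each brand is mentioned."""
    counts = Counter(brand for mentioned in runs for brand in mentioned)
    return {brand: n / len(runs) for brand, n in counts.items()}


def read_market(runs: list[set[str]], own_brand: str,
                dominant: float = 0.8) -> str:
    """Classify a prompt's market from repeated runs (assumed threshold)."""
    rates = presence_rates(runs)
    competitors = {b: r for b, r in rates.items() if b != own_brand}
    if not competitors:
        return "no competitors mentioned"
    leader, leader_rate = max(competitors.items(), key=lambda kv: kv[1])
    if leader_rate >= dominant:
        # One competitor appears in almost every run: low volatility, anchored.
        return f"established market, led by {leader}"
    # Nobody appears consistently: presence is diffuse.
    return "market in motion"


# Each set holds the brands mentioned in one run of the same prompt.
anchored = [{"A", "B"}, {"A"}, {"A", "C"}, {"A"}, {"A", "Us"}]
diffuse = [{"B"}, {"C"}, {"Us"}, {"B", "C"}, {"D"}]

print(read_market(anchored, "Us"))  # established market, led by A
print(read_market(diffuse, "Us"))   # market in motion
```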

Using them together

Fragility tells you how robust your position is across phrasing. Volatility tells you how stable your position is across repeated runs of the same prompt. The strongest GEO performance combines low Fragility (you appear regardless of phrasing) with low Volatility (you appear consistently from run to run). The most concerning combination is the opposite: high Fragility plus high Volatility means your visibility is essentially noise.
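
The combined reading can be written down as a simple two-by-two lookup. In this sketch both metrics are assumed to be scaled from 0 to 1, and the 0.5 threshold and the wording of the two mixed cases are assumptions added for illustration.

```python
def combined_reading(fragility: float, volatility: float,
                     threshold: float = 0.5) -> str:
    """Map a (Fragility, Volatility) pair to one of four readings."""
    fragile = fragility >= threshold
    volatile = volatility >= threshold
    if not fragile and not volatile:
        return "strong: consistent across phrasing and across runs"
    if fragile and volatile:
        return "noise: visibility depends on chance"
    if fragile:
        return "phrasing-dependent: visible only for some wordings"
    return "run-dependent: robust to phrasing but unstable between runs"


print(combined_reading(0.0, 0.0))    # strong: consistent across phrasing and across runs
print(combined_reading(0.67, 0.64))  # noise: visibility depends on chance
```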

Related in the hub

→ How do you turn all this measurement data into a usable report? Read 4.4 — Benchmark reporting.