GSI

How it works

The phases of running a governed AI system on your store

Eight phases: capturing your store’s voice, structuring the catalog, setting up read-only chat and research, defining tasks, refining standards, rolling out across products, approving changes, and building a compounding knowledge base across cycles.

One clarification first

The AI in this system is the worker doing the operational work — researching, auditing, drafting, validating — not the audience your store is optimised for. Your customers and traditional search-engine crawlers consume what gets published; AI is internal to how the work gets done. Optimising your catalog explicitly for AI-search retrieval (SGE, ChatGPT search, Perplexity) is a viable future task category — not a current focus.

The defining distinction

Don’t ask AI to do the work. Ask it to set up the work.

A subtle but defining distinction. It’s the difference between 500 unique inconsistent edits and 500 consistent traceable edits running through the same governed pipeline.

The generic approach

Tell a generic MCP-style AI to “edit these 500 product descriptions”. You get 500 different edits — different voice, different length, different keyword strategy. No way to know which worked. No way to roll back the bad ones. No way to learn from the result.

Our approach

You ask the AI Executive to set up the work. Once the SOP, prompt, and standards are clear, the AI Employees run it across all 500 — consistent, auditable, measurable, reversible.

Two layers, two jobs

The AI doesn’t do everything. It does two specific things, in two specific roles.

Layer 1 · AI Executive + Researcher

Setup, strategy, standards

Works with you and us. Asks the strategic questions. Conducts research. Produces SOPs, prompts, and validation rules that capture what good looks like for your store. The Executive does not edit products — it defines the rules that make every future edit consistent.

Layer 2 · AI Employees

Execution within the boundaries

Once a clear SOP exists for a task in a category, the Employees run it page by page across every product in scope. They generate proposals against the SOP, track every action with full provenance, and route what passes validation through to your approval queue.

The Executive thinks. The Employees do — within the boundaries the Executive helped you set.

A typical exchange with the AI Executive

“How should we improve the meta descriptions for this category?”

The Executive works with you to:

  • Define what a good meta description looks like for this specific category
  • Build the prompt that will generate them consistently
  • Set the validation rules — character limits, brand voice, SEO impact, scope
  • Configure the approval threshold and review flow
  • Capture all of that as a versioned SOP

You review the setup. Tweak the prompt. Confirm the standards. Then the AI Employees take that SOP and run it across all 500 products in the category, with every generation traceable to the exact SOP and prompt version. Every change can be rolled back. Every prompt can be improved, and improvements compound across the next 500.
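To make the idea of a versioned SOP concrete, here is a minimal sketch of what one might hold and how a generated draft could be checked against it. The `Sop` fields, the `validate` helper, and the example values (category name, 155-character limit, banned term) are all assumptions for illustration, not the platform's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sop:
    """Hypothetical shape of a versioned SOP; every field name is an assumption."""
    task: str
    category: str
    version: int
    prompt_template: str
    max_chars: int
    banned_terms: tuple[str, ...] = ()
    approval_threshold: float = 0.8  # illustrative value only


def validate(text: str, sop: Sop) -> list[str]:
    """Return a list of rule violations; an empty list means the text passes."""
    problems = []
    if len(text) > sop.max_chars:
        problems.append(f"too long: {len(text)} > {sop.max_chars} characters")
    lowered = text.lower()
    for term in sop.banned_terms:
        if term.lower() in lowered:
            problems.append(f"banned term: {term}")
    return problems


sop = Sop(
    task="meta_description",
    category="leather-bags",
    version=3,
    prompt_template="Write a meta description for {product_name}.",
    max_chars=155,
    banned_terms=("cheap",),
)

draft = "Cheap leather bags for every occasion."
proposal = {
    "text": draft,
    "sop_version": sop.version,  # provenance: which SOP version produced this
    "violations": validate(draft, sop),
}
print(proposal["violations"])
```

Because the proposal carries `sop_version`, a draft that misses the mark points back to the exact set of rules that produced it.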

Honest current state

The chat-driven orchestration described above is partly built and partly our north star. The phases below describe the system that exists today; the conversational layer that lets you drive more of the setup yourself is rolling out through 2026. Until then we drive setup with you — and you drive progressively more of it through chat as the layer matures.

The payoff

When the output is wrong, you fix the input.

Every AI Employee output is traceable to a specific SOP and prompt version. So when something doesn’t land — a description that’s off-brand, a meta tag that misses the angle, a title that drifts in voice — you know exactly where the gap is. And you can fix it at the source.

Option 1 · Edit the SOP directly

Open the SOP. Refine the guidance — add the constraint, the example, the rule that was missing. Save the new version. The next run picks it up automatically.

Option 2 · Ask the AI Executive

“These descriptions aren’t capturing the heritage angle for this category — can you draft updated guidance?” The Executive proposes the refinement. You review and approve it before it goes live.

Either way, you’ve fixed the input, and every product processed against that SOP from now on inherits the improvement. The fix isn’t a one-off retry. It’s a system change: a persistent capability that compounds across the rest of the catalog and across every future cycle.
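One way to picture "fix the input" is a store in which saving never overwrites, and each run resolves the latest version when it starts. The sketch below is an in-memory illustration under that assumption; `SopStore`, `run_task`, and the SOP identifier are hypothetical names, not the platform's API.

```python
class SopStore:
    """Hypothetical in-memory store; each save appends a new version."""

    def __init__(self) -> None:
        self._versions: dict[str, list[str]] = {}

    def save(self, sop_id: str, guidance: str) -> int:
        """Append new guidance and return its 1-based version number."""
        history = self._versions.setdefault(sop_id, [])
        history.append(guidance)
        return len(history)

    def latest(self, sop_id: str) -> tuple[int, str]:
        """Return (version, guidance) for the newest saved version."""
        history = self._versions[sop_id]
        return len(history), history[-1]

    def get(self, sop_id: str, version: int) -> str:
        """Older versions stay readable, so past outputs remain explainable."""
        return self._versions[sop_id][version - 1]


def run_task(store: SopStore, sop_id: str, product: str) -> dict:
    version, guidance = store.latest(sop_id)  # resolved when the run starts
    return {"product": product, "sop_version": version, "guidance": guidance}


store = SopStore()
store.save("descriptions/heritage", "Describe materials and fit.")
first = run_task(store, "descriptions/heritage", "Oxford brogue")

# Refine the guidance; the next run picks up version 2 without other changes.
store.save("descriptions/heritage", "Describe materials and fit. Mention the founding year.")
second = run_task(store, "descriptions/heritage", "Derby boot")
print(first["sop_version"], second["sop_version"])
```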

The persistence payoff

In six months, the underlying LLM may be different. We might have moved across versions of Claude, GPT, or Gemini — or to whatever’s released after this page was written. We might use different models for different tasks. The models change. Your SOPs don’t.

Your store’s standards, prompts, and accumulated learnings live in your configuration — in versioned SOPs, prompt snippets, and validation rules — not inside whatever model is currently running them.

That’s the difference between “AI helped me edit some products last quarter” and “the operational discipline of my store has measurably improved — and stays improved.”

The contrast

Most AI tools store prompt knowledge in model context windows that evaporate between sessions and break when models change. We store it as versioned configuration that persists. The store gets smarter at running itself, and that improvement lasts, regardless of which LLM is doing the lifting next quarter.

Every engagement adapts

The phases below are consistent. The intensity is not. Some operators want to be hands-on through every cycle — reviewing proposals in detail, refining prompts with us, shaping the SOPs as they grow. Others prefer to set the rules early and let the system run, reviewing outputs at their own pace. We design the engagement around how involved you want to be. The system is the same; how you touch it varies.

Phase 01

Capture the store context, voice, and rules

Before any AI touches the store, we capture what makes it specific — brand voice, audience, tone, regulated language, and words to avoid.

What happens

  • Extract voice patterns from existing high-performing content
  • Document target audience descriptors and tone variations by category
  • Capture blacklisted terms, regulated language, and brand vocabulary
  • Set the foundational AI configuration the rest of the system inherits from

Outcome: Your store’s AI configuration v1 — the foundation every downstream task inherits from.

Phase 02

MECE structure + ABC analysis

Restructure (or set up) product categorisation as a MECE (mutually exclusive, collectively exhaustive) taxonomy so AI can reason at the category level. Layer ABC analysis on top so AI budget goes where the revenue is.

What happens

  • Map your existing categories into a MECE taxonomy where it improves clarity
  • Build the category tree the system uses to inherit rules and prompts
  • Run ABC analysis — identify A (top revenue), B (mid), C (long-tail) products
  • Configure per-tier AI handling: heavier treatment for A products, lighter for C

Outcome: A category structure that scales rules without hand-tuning each product, and an ABC tiering that allocates AI budget where revenue lives.
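As a rough illustration of the ABC step, the sketch below ranks products by revenue and tiers them by cumulative revenue share. The 80% and 95% cutoffs are a common convention assumed here, not thresholds stated for this system, and the `TIER_HANDLING` settings are invented placeholders for "heavier" and "lighter" treatment.

```python
def abc_tiers(revenue_by_product: dict[str, float],
              a_cutoff: float = 0.80,
              b_cutoff: float = 0.95) -> dict[str, str]:
    """Assign A/B/C by cumulative share of revenue, highest earners first."""
    total = sum(revenue_by_product.values())
    if total <= 0:
        return {name: "C" for name in revenue_by_product}
    tiers: dict[str, str] = {}
    running = 0.0
    ranked = sorted(revenue_by_product.items(), key=lambda kv: kv[1], reverse=True)
    for name, revenue in ranked:
        # Decide the tier from the share accumulated *before* this product,
        # so the top seller always lands in A.
        share_before = running / total
        if share_before < a_cutoff:
            tiers[name] = "A"
        elif share_before < b_cutoff:
            tiers[name] = "B"
        else:
            tiers[name] = "C"
        running += revenue
    return tiers


# Hypothetical per-tier handling; keys and values are illustrative only.
TIER_HANDLING = {
    "A": {"audit_frequency_days": 7, "human_review": "every proposal"},
    "B": {"audit_frequency_days": 30, "human_review": "sample"},
    "C": {"audit_frequency_days": 90, "human_review": "sample"},
}

revenue = {"tote": 700.0, "satchel": 150.0, "wallet": 90.0, "keyring": 40.0, "strap": 20.0}
tiers = abc_tiers(revenue)
for name, tier in tiers.items():
    print(name, tier, TIER_HANDLING[tier]["audit_frequency_days"])
```

With this sample data the two top sellers land in A, the next two in B, and the last in C, so review effort follows revenue.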

Phase 03

Set up MCP for chat, research, and reporting

The MCP layer is read-only by architecture. Plain-English chat with your store data, AI-assisted research, and executive reporting: none of it has a write path. The only way changes reach the store is through the approval pipeline.

What happens

  • Configure MCP servers for the store, GA4, and Search Console
  • Enable read-only chat scoped to your data warehouse
  • Set up research pipelines: competitor analysis, keyword expansion, page-intent
  • Wire up reporting and executive summaries

Outcome: AI tooling you can use today, scoped to read-only by design. Investigations, research, and reports without any risk of accidental writes.
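"Read-only by design" means the interface the chat and research tools see has no write method to call, rather than a write method guarded by a check. The sketch below shows that idea in miniature with an in-process facade. `ReadOnlyCatalog` and its methods are hypothetical, and in a real deployment the guarantee would more plausibly come from read-only credentials and API scopes than from Python objects.

```python
from types import MappingProxyType


class ReadOnlyCatalog:
    """Hypothetical read-only facade: exposes lookups, defines no write methods."""

    def __init__(self, products: dict[str, dict]) -> None:
        # Each record is copied, then wrapped in a read-only mapping view.
        self._products = MappingProxyType(
            {sku: MappingProxyType(dict(fields)) for sku, fields in products.items()}
        )

    def get(self, sku: str):
        """Return a read-only view of one product record."""
        return self._products[sku]

    def search(self, keyword: str) -> list[str]:
        """Return SKUs whose title contains the keyword (case-insensitive)."""
        needle = keyword.lower()
        return [sku for sku, p in self._products.items() if needle in p["title"].lower()]


catalog = ReadOnlyCatalog({
    "SKU-1": {"title": "Leather Tote"},
    "SKU-2": {"title": "Canvas Satchel"},
})
print(catalog.search("tote"))

try:
    catalog.get("SKU-1")["title"] = "Changed"
except TypeError as err:
    # mappingproxy objects reject item assignment
    print("write refused:", err)
```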

Phase 04

Define the task library

Every kind of work the system performs is a defined task with its own prompt template, validation rules, and approval flow.

What happens

  • SEO meta fields, product titles, product descriptions, alt text
  • Data consistency audits across custom and meta fields
  • Google Ads title and description generation
  • Research tasks: competitor, keyword, intent, gap analysis
  • AI-generated lifestyle imagery briefs and assets
  • Each task gets its own prompt template, validators, approval threshold, and scoped write mapping (one task, one field)

Outcome: A defined library of work the system can do — every task observable, governable, and replayable.
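A task definition along these lines could be sketched as a small record that bundles the prompt template, validators, approval threshold, and the single field the task may write. Everything below, including `TaskDefinition`, `apply_approved`, the `image_alt` field name, and the 125-character limit, is an assumption for illustration.

```python
from dataclasses import dataclass
from typing import Callable

Validator = Callable[[str], bool]


@dataclass(frozen=True)
class TaskDefinition:
    """Hypothetical task record: one task, one writable field."""
    name: str
    prompt_template: str
    validators: tuple[Validator, ...]
    approval_threshold: float
    write_field: str  # the only field this task is allowed to change


def apply_approved(product: dict, task: TaskDefinition, value: str) -> dict:
    """Return a copy of the product with only the task's field changed."""
    if not all(check(value) for check in task.validators):
        raise ValueError(f"{task.name}: value failed validation")
    updated = dict(product)
    updated[task.write_field] = value
    return updated


alt_text = TaskDefinition(
    name="alt_text",
    prompt_template="Describe the image of {title} in one sentence.",
    validators=(
        lambda s: 0 < len(s) <= 125,                      # illustrative length limit
        lambda s: not s.lower().startswith("image of"),   # illustrative style rule
    ),
    approval_threshold=0.9,
    write_field="image_alt",
)

product = {"title": "Leather Tote", "description": "Original copy.", "image_alt": ""}
updated = apply_approved(product, alt_text, "Tan leather tote bag on a wooden bench.")
print(updated["image_alt"])
```

The write path takes the field name from the task definition rather than from the generated output, which is what keeps an alt-text task from ever touching a description.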

Phase 05

Refine standards from priority categories

Start where it matters most. Define what good, bad, and expected look like for each task within a priority category. The AI learns from those standards and applies them across the rest of the catalog.

What happens

  • Pick the highest-impact categories first — usually A-tier revenue
  • For each task in each category, define what good output looks like (and what bad looks like)
  • Build or integrate SOPs that capture those standards
  • Refine prompt snippets so the standards are encoded, not just described

Outcome: A growing library of SOPs and prompt snippets that lets the AI understand what good means for your store specifically.

Phase 06

Roll out across products — fully observable

Once standards are set for a category, the system runs across every product in it. Each product gets a quality score, audit findings, and improvement candidates. Every run is fully observable.

What happens

  • Scheduled audit runs per category, generating per-product quality scores
  • Surface findings and improvement candidates into the recommendations queue
  • Every task run logged: exact prompt version, config version, model used, rating returned, reasoning trail, recommendation
  • Anything questionable can be inspected and traced back to its inputs

Outcome: A continuously updated view of every product’s quality state, with traceable recommendations ready for approval.
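The per-run log described above maps naturally onto a flat record. The sketch below shows one possible shape, serialised as a JSON line so it can be stored and read back unchanged. The `TaskRun` field names follow the list above, but the exact schema, the version strings, and the model name are assumptions.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TaskRun:
    """Hypothetical log record for one task run on one product."""
    product_id: str
    task: str
    prompt_version: str
    config_version: str
    model: str
    rating: float
    reasoning: str
    recommendation: str


def to_log_line(run: TaskRun) -> str:
    """Serialise a run as one JSON line with stable key order."""
    return json.dumps(asdict(run), sort_keys=True)


run = TaskRun(
    product_id="SKU-1",
    task="meta_description",
    prompt_version="v3",
    config_version="v1",
    model="example-model",  # placeholder, not a real model name
    rating=0.62,
    reasoning="Description exceeds the length limit and omits the material.",
    recommendation="Regenerate against the current SOP.",
)

line = to_log_line(run)
restored = TaskRun(**json.loads(line))  # the record survives a round trip
print(restored == run)
```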

Phase 07

Governed approvals — applied to your store

Recommendations enter the four-gate flow: Propose → Validate → Review → Apply. Approved changes write only the scoped field. Every change logged. Every change reversible.

What happens

  • Bulk review with diff preview and sample-first approval
  • Approve in batches, reject individuals, or request changes
  • Approved proposals apply through scoped write APIs — never beyond the approved field
  • Append-only ledger with full provenance and one-click rollback per change or per batch

Outcome: Approved changes live on your store with a complete audit trail, attribution, and rollback. Performance tracking begins automatically the moment a change applies.
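The gate and ledger mechanics can be sketched in a few lines: a change applies only from the approved state, every write appends a ledger entry holding the before and after values, and a rollback is itself a new entry rather than a deletion. The names below (`Gate`, `Ledger`, `apply_change`, `rollback_last`) are hypothetical, and a plain dict stands in for the store.

```python
from dataclasses import dataclass
from enum import Enum


class Gate(Enum):
    PROPOSED = "proposed"
    VALIDATED = "validated"
    APPROVED = "approved"  # the Review gate has been passed
    APPLIED = "applied"


@dataclass(frozen=True)
class LedgerEntry:
    sku: str
    field: str
    before: str
    after: str
    kind: str  # "apply" or "rollback"


class Ledger:
    """Append-only: entries can be added and read, never edited or removed."""

    def __init__(self) -> None:
        self._entries: list[LedgerEntry] = []

    def append(self, entry: LedgerEntry) -> None:
        self._entries.append(entry)

    def entries(self) -> tuple[LedgerEntry, ...]:
        return tuple(self._entries)


def apply_change(store: dict, ledger: Ledger, sku: str, field: str,
                 new_value: str, gate: Gate) -> Gate:
    """Write one scoped field, but only for an approved proposal."""
    if gate is not Gate.APPROVED:
        raise PermissionError("only approved proposals may be applied")
    before = store[sku][field]
    store[sku][field] = new_value
    ledger.append(LedgerEntry(sku, field, before, new_value, "apply"))
    return Gate.APPLIED


def rollback_last(store: dict, ledger: Ledger, sku: str, field: str) -> None:
    """Restore the value from before the latest apply and record the rollback."""
    for entry in reversed(ledger.entries()):
        if entry.sku == sku and entry.field == field and entry.kind == "apply":
            current = store[sku][field]
            store[sku][field] = entry.before
            ledger.append(LedgerEntry(sku, field, current, entry.before, "rollback"))
            return
    raise LookupError("no applied change to roll back")


store = {"SKU-1": {"meta_description": "Old text.", "title": "Leather Tote"}}
ledger = Ledger()
apply_change(store, ledger, "SKU-1", "meta_description", "New text.", Gate.APPROVED)
rollback_last(store, ledger, "SKU-1", "meta_description")
print(store["SKU-1"]["meta_description"], [e.kind for e in ledger.entries()])
```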

Phase 08

Continuous cycle — knowledge compounds

Each cycle builds store-specific knowledge: which prompts worked, which audits matter, which standards need to evolve. The knowledge persists across model changes and engagement pauses.

What happens

  • Outcomes tracked per change, per prompt, per category: rankings, engagement, impressions, conversions
  • SOPs and prompt snippets refined based on what worked
  • New audits and standards added as patterns emerge
  • Knowledge carries forward when underlying LLMs change — your store-specific tuning is in the SOPs, not the model

Outcome: Institutional store knowledge that survives staff changes, platform updates, and LLM upgrades. The longer the system runs, the more specifically tuned it is to your store.

Common questions

When do I start seeing results on the store?

Approved changes apply to the store from Phase 7 onward — once foundational configuration is in place and the first priority category has been refined. Real category-level outcome attribution requires several cycles of changes to produce meaningful data; ranking and engagement deltas need indexing and traffic to settle, so trustworthy outcome reads always lag the changes themselves.

What if the AI suggests something I disagree with?

You reject it. Every proposal is reviewable before anything reaches your store, and rejection is one click. Rejected proposals get logged with the reason, which feeds back into prompt refinement in Phase 8 so the same mistake doesn’t repeat.

Do I have to migrate platforms or change my store setup?

No. We integrate with your existing Shopify or Neto store as-is. We never ask for write access beyond explicit, scoped API calls for approved changes. If you ever want to move to a different platform, your store data stays exactly where it is — we don't lock you in.

What happens if I want to pause the engagement?

We pause the operational cadence — no new audits, no new proposals — but the ledger, approval history, SOPs, and store knowledge stay intact. When you resume, the system picks up from where it left off, not from scratch.

Can I scale this to a second or third store?

Yes. Per-store engagements with shared store-group-level rules where it makes sense. Agencies running multiple stores get particular value because the same playbooks compound across the portfolio — what we learn from your first store accelerates your second.

Do I log in and use the platform myself, or is it all run by you?

Both. You have full login access from day one, and the platform is fully yours to use:

  • Chat over your store data through the MCP layer
  • Request and receive reports (executive summaries, audits, custom performance views) on schedule or on demand
  • View performance dashboards covering rankings, engagement, conversions, and per-category attribution
  • Review and approve proposals
  • Edit prompts
  • Contribute to the SOPs and documentation that govern future work

What we lead is the initial structural setup (your MECE taxonomy, voice configuration, validation rules, and first task SOPs) to make sure the foundations get built properly. After that, day-to-day operation is something you do, with us alongside. A pure self-serve onboarding flow (operating from scratch without our guidance) is rolling out through 2026.

Ready to see what Phase 1 looks like for your store?

We’re taking on a small number of founding stores. Pace, depth, and intensity all adapt to how involved you want to be.