The Catalog Quality Curator

Every large organization running a modern service portal faces the same quiet crisis. The technology is capable: agentic AI can understand natural language, match intent to services, pre-fill forms from context, and route requests to the right team in seconds. But none of that works if the services themselves are described in ways that only the support team understands. We built our My Support & Services portal on BMC Helix Digital Workplace with genuine ambition. Then we launched HelixGPT 25.4 and discovered, in measurable detail, just how wide the gap between platform capability and content quality really was.

What followed was not a technology project. It was a content governance reckoning — and out of it came something we did not expect: an agentic AI agent that teaches other AI agents how to talk to humans.

1,800+ catalog items across 7 business domains · 5 governed quality fields per service · 2 fundamental service flows: Report & Request

The Catalog Was Built for Us, Not for Them

When we audited our catalog seriously — and "seriously" meant scoring every item against a defined quality framework — the pattern was unmistakable. Services were named for internal teams, not for the employees who needed them. Descriptions answered the question "what does this form create in ITSM?" rather than "what problem does this solve for you?" Keywords were either missing entirely or stuffed with generic terms like "support", "help", and "not working" that matched hundreds of services simultaneously and therefore matched none of them usefully.

This is not a failure of effort. Every service was built by someone who knew their domain, knew their process, and knew what they needed the ticket to contain. They built for the resolver. Nobody asked them to build for the requester — because until HelixGPT, the requester was expected to browse a menu and figure it out.

The core insight

A catalog built for support teams produces support-team-readable content. HelixGPT reads the same content an employee reads. If the employee cannot understand it, neither can the AI.

HelixGPT changed the equation entirely. The AI does not browse menus. It reads service names, descriptions, and keywords the same way a new employee would — in plain language, looking for a match to what they actually said. When an employee types "I can't get into the finance system," HelixGPT searches for a service whose content resonates with that expression. If the closest match is titled "SAP-FIN-ACC-001 Access Form" with a description that reads "Submit this form to raise a SRM ticket for SAP Finance module entitlement provisioning" — the AI either guesses wrong, presents an unhelpfully long list of ambiguous options, or escalates to a live agent.

The failure, in every one of those cases, is not in the AI. It is in the content the AI has to work with.

The most important thing we learned is that AI readiness is not a technology decision. It is a content decision. You can deploy the most sophisticated agentic AI platform on the market and watch it underperform because the catalog it reads was written for a different era.

MSS Product Owner · My Support & Services

Two Verbs to Rule Them All

Before we could fix individual services, we needed to solve a more fundamental problem: there was no consistent mental model for what a catalog service actually is. Some services were incident forms. Some were procurement requests. Some were HR entitlement processes. Some were pure information requests. They were all described differently, categorised inconsistently, and named using whatever convention the Service Owner happened to prefer on the day they built the form.

The breakthrough was deceptively simple. Every employee interaction with a service catalog falls into one of two fundamentally different journeys, driven by a fundamentally different mental state.

REPORT · Support Flow · Detect → Correct

The employee has a problem, an incident, or a question. Something has gone wrong or they need guidance. They are reacting. They did not plan to be here. The service must meet them where they are — with empathy, clarity about what happens next, and a fast path to resolution.

REQUEST · Value Flow · Request → Fulfill

The employee wants something they do not currently have. Access, hardware, a license, an entitlement, a booking. They are acting deliberately. They know what they want. The service must be clear about eligibility, approval, and delivery timeline — and get out of their way.

Two verbs. Two flows. Every one of our 1,800 services fits into one of them — and the classification is never ambiguous once you apply a simple decision tree. Is the employee reacting to a problem? REPORT. Are they proactively getting something? REQUEST. If a service genuinely spans both, it is almost certainly two services that were never properly separated.
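The decision tree above can be sketched in a few lines of Python. This is a minimal illustration, not the Curator's actual logic: the names `Flow` and `classify_flow` and the two boolean inputs are hypothetical, and raising an error when neither question applies is an assumption the text does not cover.

```python
from enum import Enum


class Flow(Enum):
    REPORT = "REPORT"    # Support Flow: Detect -> Correct
    REQUEST = "REQUEST"  # Value Flow: Request -> Fulfill


def classify_flow(reacting_to_problem: bool, proactively_getting_something: bool) -> Flow:
    """Apply the two-question decision tree to a single service.

    A service that answers yes to both questions is treated as two services
    that were never separated, so it is rejected rather than classified.
    """
    if reacting_to_problem and proactively_getting_something:
        raise ValueError("spans both flows: probably two services that should be separated")
    if reacting_to_problem:
        return Flow.REPORT
    if proactively_getting_something:
        return Flow.REQUEST
    # Assumption: a service matching neither question needs a human to clarify it.
    raise ValueError("neither flow applies: clarify the service before classifying it")


# "I can't get into the finance system" is a reaction to a problem.
print(classify_flow(reacting_to_problem=True, proactively_getting_something=False).value)
```

Rejecting the "both" case instead of picking a default keeps the classification unambiguous and surfaces services that need to be split.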

This is not bureaucracy. It is precision engineering for AI intent recognition. HelixGPT does not read taxonomy. It reads language. A service that begins with REPORT tells the AI — before a single word of the description is processed — that this is a support interaction. The routing, the SLA expectation, and the tone of the entire service description follow from that single word. At 1,800 services and growing, consistency of signal is not a preference. It is a technical requirement.

The Five Fields That Make a Service Findable

Once we had the two-flow model, we could define what quality actually means for a catalog service. Not subjectively — not "it sounds professional" or "the team is happy with it" — but measurably, against a scoring framework that produces a Catalog Quality Score (CQS) out of 100.

Five fields. No more. We were deliberate about keeping the scope tight because governance that covers twenty fields is governance that nobody follows. Five fields that actually determine whether HelixGPT can find, understand, and suggest a service accurately:

The five governed fields

Service Name (15 pts) — begins with REPORT or REQUEST, plain English, under 80 characters, no internal codes.

Short Description (30 pts) — answers four questions: what is this for, who should use it, when should they use it, and what happens after they submit.

Keywords & Tags (25 pts) — 5–10 domain-specific terms only. System names, role synonyms, service-specific terminology. No generic terms that match hundreds of services simultaneously.

Form Question Labels & Help Text (20 pts) — every non-obvious question has plain-language guidance explaining what to enter, why it is needed, and where to find it.

Category & Subcategory Mapping (10 pts) — correctly placed in the approved four-level taxonomy: Domain → Category → Subcategory → Service.

The short description carries the highest weight — 30 points — because it is the primary text HelixGPT reads to understand what a service does. A service with a perfect name and keywords but a weak description is a service the AI will still get wrong.

Keywords, counterintuitively, carry less weight than description — and their standard is deliberately restrictive. We ban generic terms outright. No "not working." No "broken." No "I need." No "help." These words appear in hundreds of services. When a keyword matches 600 services equally, it narrows nothing. It is noise. The keyword field exists to surface terms that are unique to a specific service — the product name, the internal nickname, the role-specific jargon that a particular user community would type when they need exactly this service.
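The mechanical parts of the Service Name and Keywords standards lend themselves to automated checks. The sketch below is illustrative only: the function names are hypothetical, the banned list contains just the generic terms quoted in this article, and the regular expression for "internal codes" is an assumed heuristic (hyphenated upper-case tokens such as SAP-FIN-ACC-001), not a rule from the framework. "Plain English" still needs a human or the Curator to judge.

```python
import re

# Generic terms quoted in the article; a real ban list would likely be longer.
GENERIC_TERMS = {"support", "help", "not working", "broken", "i need"}

# Assumed heuristic for internal codes: upper-case tokens joined by hyphens.
INTERNAL_CODE = re.compile(r"\b[A-Z]{2,}(?:-[A-Z0-9]+)+\b")


def lint_service_name(name: str) -> list[str]:
    """Return the rule violations found in a service name."""
    problems = []
    if not name.startswith(("REPORT ", "REQUEST ")):
        problems.append("name must begin with REPORT or REQUEST")
    if len(name) >= 80:
        problems.append("name must be under 80 characters")
    if INTERNAL_CODE.search(name):
        problems.append("name contains what looks like an internal code")
    return problems


def lint_keywords(keywords: list[str]) -> list[str]:
    """Return the rule violations found in a keyword list."""
    problems = []
    if not 5 <= len(keywords) <= 10:
        problems.append(f"expected 5-10 keywords, found {len(keywords)}")
    for keyword in keywords:
        if keyword.strip().lower() in GENERIC_TERMS:
            problems.append(f"generic term not allowed: {keyword!r}")
    return problems


print(lint_service_name("SAP-FIN-ACC-001 Access Form"))
print(lint_keywords(["SAP", "help"]))
```

Checks like these only catch the objective failures; they say nothing about whether the remaining keywords are actually specific to the service.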

85–100 pts · Gold: HelixGPT Ready

Fully optimised. HelixGPT can find, suggest, and pre-fill this service accurately. No action required.

65–84 pts · Silver: Findable

Discoverable via search but not reliably suggested by HelixGPT. Improvement recommended within the current PI.

40–64 pts · Bronze: At Risk

Significant discoverability gaps. High risk of wrong form selection and misrouting. Transformation required.

0–39 pts · Critical: Invisible

Effectively invisible to both search and HelixGPT. Primary source of misrouted tickets. Priority transformation.
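To make the scoring concrete, here is a minimal sketch of how the five field weights add up to a CQS and map to a tier. The weights and tier boundaries come from the framework above; the dictionary keys, function names, and the idea that each field earns an integer number of points up to its maximum are assumptions, since how partial credit is awarded within a field is not described here.

```python
# Maximum points per governed field (weights from the framework; keys are hypothetical).
MAX_POINTS = {
    "service_name": 15,
    "short_description": 30,
    "keywords_tags": 25,
    "form_labels_help_text": 20,
    "category_mapping": 10,
}

# Tier floors, highest first.
TIERS = [(85, "Gold"), (65, "Silver"), (40, "Bronze"), (0, "Critical")]


def catalog_quality_score(points: dict[str, int]) -> int:
    """Sum the per-field points after checking each against its maximum."""
    if set(points) != set(MAX_POINTS):
        raise ValueError("points must cover exactly the five governed fields")
    for field_name, value in points.items():
        if not 0 <= value <= MAX_POINTS[field_name]:
            raise ValueError(f"{field_name} must be between 0 and {MAX_POINTS[field_name]}")
    return sum(points.values())


def tier_for(score: int) -> str:
    """Map a 0-100 score to its tier name."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for floor, name in TIERS:
        if score >= floor:
            return name
    raise AssertionError("unreachable: the last tier floor is 0")


# A service with a perfect name and keywords but a weak description.
example = {
    "service_name": 15,
    "short_description": 10,
    "keywords_tags": 25,
    "form_labels_help_text": 20,
    "category_mapping": 10,
}
score = catalog_quality_score(example)
print(score, tier_for(score))  # 80 Silver
```

The example shows the effect of the description weight: every other field is perfect, yet the service still misses Gold.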

An AI Agent That Improves the Catalog for AI Agents

Here is where the project became genuinely interesting. We had a framework. We had a scoring model. We had 1,800 services to transform and a handful of Catalog Admins to do it. The arithmetic was brutal. Fifteen minutes of drafting per service already adds up to 450 hours, and we estimated that a single Catalog Admin working full time would need eighteen months to get through the whole backlog, which keeps growing as new services are added.

The solution was to build an agentic AI agent — the Catalog Quality Curator — that does the drafting work so humans can focus on the judgment work.

The Curator is a HelixGPT agent built in Innovation Studio that a Service Owner can invoke in two ways: directly, when they want to improve or create a service, or automatically, via a bi-annual review email that reminds every Service Owner to acknowledge, update, or retire each service they own. When a Service Owner opens a session, the Curator reads the live catalog data for their service, scores it against the CQS framework, explains every gap in plain language, and proposes specific improvements — drafting the service name, description, keywords, and form help text from scratch where needed.

How a session works

The Curator never asks more than one question at a time. It classifies the flow first, verifies the Service Owner's identity, checks pre-requisites for new services (backend type, assignment group), walks through all five fields, resolves gaps interactively, and presents a Final Preview — the service exactly as it will appear in the MSS portal — before asking for approval. The Service Owner types APPROVE, CHANGE, or FLAG. On approval, a locked improvement report goes to the Catalog Admin queue for implementation.

The Curator does not apply changes directly. This is intentional. The Service Owner reviews and approves every proposed change. The Catalog Admin implements it. The AI does the drafting — the hardest and most time-consuming part — while humans retain authority over what goes live. This is not a limitation. It is a governance choice, and it is the choice that makes the whole system trustworthy.
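The approval step can be pictured as a small handler that never touches the live catalog. This is a hypothetical sketch, not the Innovation Studio implementation: `ImprovementReport`, `handle_decision`, and the returned state names are invented for illustration, and what happens after CHANGE or FLAG is assumed, because the session description only specifies the APPROVE path.

```python
from dataclasses import dataclass


@dataclass
class ImprovementReport:
    service_name: str
    proposed_changes: dict[str, str]
    locked: bool = False


def handle_decision(
    decision: str,
    report: ImprovementReport,
    admin_queue: list[ImprovementReport],
) -> str:
    """Route the Service Owner's reply without applying anything to the catalog."""
    choice = decision.strip().upper()
    if choice == "APPROVE":
        # The report is locked and handed to a human Catalog Admin to implement.
        report.locked = True
        admin_queue.append(report)
        return "queued_for_catalog_admin"
    if choice == "CHANGE":
        # Assumption: the session returns to drafting with the owner's feedback.
        return "revise_draft"
    if choice == "FLAG":
        # Assumption: the follow-up to a flag is not described in the article.
        return "flagged"
    raise ValueError("expected APPROVE, CHANGE, or FLAG")


queue: list[ImprovementReport] = []
report = ImprovementReport(
    service_name="REQUEST access to the SAP Finance system",
    proposed_changes={"short_description": "(draft text)"},
)
print(handle_decision("approve", report, queue), len(queue), report.locked)
```

Nothing in the handler writes to the catalog; the only side effect of approval is a locked report in the admin queue, which mirrors the governance choice described above.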

What We Learned About AI-Ready Content at Scale

Building this system taught us things about enterprise AI adoption that no platform vendor had articulated clearly. We want to share three of them.

1. AI readiness is a content problem, not a technology problem

The organisations that will get the most from agentic AI in the next three years will not be the ones that deploy the most sophisticated models. They will be the ones that invest in content quality before the AI arrives. The model is already good enough. The content is almost never good enough. Fixing that gap after deployment — as we are doing — is harder than fixing it before.

2. Governance without enforcement is decoration

We tried informal guidance for years. "Please write clear descriptions." It did not work — not because Service Owners did not care, but because there was no standard to write to, no score to improve, and no consequence for not improving. The CQS framework works because it produces a number, and numbers drive conversations that prose cannot. A service scoring 23 out of 100 is a conversation that happens. "Your description could be clearer" is a conversation that does not.

3. The two-verb model scales in ways that complex taxonomies do not

We considered more elaborate classification systems. Seven service types. Fourteen interaction patterns. Every additional category added governance overhead without adding proportional signal value. Two verbs — REPORT and REQUEST — turn out to be sufficient because they represent the only distinction that actually changes downstream behavior: the backend process, the SLA type, the HelixGPT routing logic, and the tone of the service description. Everything else is detail that belongs inside the service, not in its classification.

The Bi-Annual Review Cycle: Keeping Quality Alive

Transformation is not enough. A catalog that reaches Gold tier in 2026 will drift back toward Bronze by 2028 if there is no mechanism to maintain it. Services get deprioritised. Systems change. SLAs are updated. Service Owners move teams. The content goes stale while the catalog keeps serving it.

The answer is a governance rhythm. Every six months, every active Service Owner receives a personalised actionable email for each service they own. Three options: acknowledge the service is current, update it, or retire it. The Curator is one click away. The retirement form is one click away. The acknowledgement is one click — no login required.

1. Bi-annual trigger fires

Microsoft 365 / Power Automate generates personalised emails for every active service owner: one email per service, with the current CQS tier and last review date visible in the email itself.

2. Service Owner chooses their action

Acknowledge (service is current), Update (opens Curator bot or change request form), or Retire (opens retirement request form). All three paths are one click from the email.

3. Non-responses are escalated automatically

After ten business days, a reminder is sent. After fifteen, the Service Owner's line manager is notified. After thirty, the service is flagged to the Product Owner. The catalog does not go stale silently.

4. Cycle closes with a metrics report

The Product Owner receives a summary: services reviewed, CQS tier distribution, actions taken, non-responsive owners, net change in average score cycle-over-cycle.
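The escalation schedule in step 3 reduces to a threshold lookup on business days without a response. The sketch below is plain Python for illustration, not the Power Automate flow itself: the function and step names are hypothetical, business days are assumed to be Monday to Friday with no holiday calendar, and each step is assumed to fire on reaching its threshold.

```python
from datetime import date, timedelta


def business_days_between(start: date, end: date) -> int:
    """Count Monday-Friday days after `start`, up to and including `end`."""
    days = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            days += 1
    return days


def escalation_step(business_days_without_response: int) -> str:
    """Return the furthest escalation step reached for a non-responsive owner."""
    if business_days_without_response < 0:
        raise ValueError("elapsed business days cannot be negative")
    if business_days_without_response >= 30:
        return "flag_to_product_owner"
    if business_days_without_response >= 15:
        return "notify_line_manager"
    if business_days_without_response >= 10:
        return "send_reminder"
    return "wait"


# Review email sent on Monday 5 January 2026, checked on Monday 19 January 2026.
elapsed = business_days_between(date(2026, 1, 5), date(2026, 1, 19))
print(elapsed, escalation_step(elapsed))  # 10 send_reminder
```

Checking the largest threshold first means a long-silent owner is reported at the furthest step reached rather than at the first one.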

The Bigger Picture

What we have built is not primarily a tool for improving catalog descriptions. It is a model for how organisations should think about AI-ready content governance at enterprise scale.

The Catalog Quality Curator demonstrates something that will become increasingly important as agentic AI proliferates across enterprise platforms: the bottleneck to AI performance is almost never the AI itself. It is the quality, consistency, and governance of the content the AI consumes. Every organisation deploying HelixGPT, Microsoft Copilot, ServiceNow AI, or any other enterprise AI assistant will eventually discover this. The ones that discover it before deployment — and invest in the content infrastructure first — will see dramatically better return on their AI investment than those who discover it after.

A service catalog that employees cannot find is a cost center. A service catalog that HelixGPT can accurately navigate is a strategic asset. The distance between those two things is not a technology gap. It is five fields, two verbs, and the discipline to govern them.

We used an AI agent to make our catalog