Service Catalog: Making Your Platform Discoverable
Build a service catalog that helps developers find, understand, and integrate with platform services. Covers catalog design, API documentation standards, ownership tracking, dependency mapping, and keeping the catalog from becoming another stale wiki.
A platform is only as useful as its discoverability. If developers cannot find your authentication service’s API docs, they will build their own authentication. If they cannot find your shared logging library, they will write their own. The service catalog is the index that prevents this duplication — a single source of truth for what exists, who owns it, and how to use it.
What Goes in a Service Catalog
Service Metadata
Every service entry should include:
name: order-service
description: "Manages order lifecycle from creation to fulfillment"
owner: backend-team
tier: critical # critical, standard, experimental
lifecycle: production # development, staging, production, deprecated
repository: https://github.com/org/order-service
documentation: https://docs.internal/order-service
api_spec: https://api.internal/order-service/openapi.yaml
dashboards:
- https://grafana.internal/d/order-service
- https://grafana.internal/d/order-service-business
oncall: https://pagerduty.com/schedules/order-team
slack: "#order-service"
dependencies:
- payment-service
- inventory-service
- postgres-primary
- redis-shared
consumers:
- api-gateway
- checkout-frontend
- reporting-service
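The `dependencies` and `consumers` fields describe the same edges from two sides, so they can drift apart. Below is a minimal sketch of a consistency check, assuming the entries have already been parsed into dictionaries shaped like the example above. The function name `find_asymmetric_edges` and the sample data are illustrative, not part of any existing tool.

```python
from typing import Dict, List, Tuple


def find_asymmetric_edges(catalog: Dict[str, dict]) -> List[Tuple[str, str]]:
    """Return (service, dependency) pairs where the dependency has a catalog
    entry but does not list the service among its consumers."""
    problems = []
    for name, entry in catalog.items():
        for dep in entry.get("dependencies", []):
            dep_entry = catalog.get(dep)
            if dep_entry is None:
                # The dependency has no entry of its own (for example an
                # external API), so there is nothing to compare against.
                continue
            if name not in dep_entry.get("consumers", []):
                problems.append((name, dep))
    return sorted(problems)


# Hypothetical sample data: inventory-service forgot to list its consumer.
catalog = {
    "order-service": {
        "dependencies": ["payment-service", "inventory-service"],
        "consumers": ["api-gateway"],
    },
    "payment-service": {"dependencies": [], "consumers": ["order-service"]},
    "inventory-service": {"dependencies": [], "consumers": []},
}

for service, dep in find_asymmetric_edges(catalog):
    print(f"{dep} does not list {service} as a consumer")
```

An alternative design is to store only `dependencies` and derive `consumers` at build time, which removes the possibility of disagreement altogether.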
API Documentation
Every service with an API publishes a machine-readable specification:
# OpenAPI 3.0
openapi: 3.0.0
info:
  title: Order Service
  version: 2.4.1
  description: |
    Manages the full order lifecycle. All endpoints require
    a valid JWT in the Authorization header.
paths:
  /api/orders:
    post:
      summary: Create a new order
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CreateOrderRequest'
      responses:
        '201':
          description: Order created
        '400':
          description: Invalid request
        '402':
          description: Payment failed
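A published spec is only useful if it is complete enough to code against. The following is a rough sketch of a lint step, assuming the spec has already been parsed into a dictionary. The function name `lint_openapi` and the choice of checks are assumptions for illustration, and the undocumented `get` operation in the sample is invented to trigger the warnings.

```python
from typing import List

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def lint_openapi(spec: dict) -> List[str]:
    """Return human-readable problems found in a parsed OpenAPI document."""
    problems = []
    if not str(spec.get("openapi", "")).startswith("3."):
        problems.append("missing or unsupported 'openapi' version")
    info = spec.get("info") or {}
    for field in ("title", "version"):
        if not info.get(field):
            problems.append(f"info.{field} is missing")
    for path, item in (spec.get("paths") or {}).items():
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as 'parameters'
            label = f"{method.upper()} {path}"
            if not operation.get("summary"):
                problems.append(f"{label}: no summary")
            if not operation.get("responses"):
                problems.append(f"{label}: no responses documented")
    return problems


spec = {
    "openapi": "3.0.0",
    "info": {"title": "Order Service", "version": "2.4.1"},
    "paths": {
        "/api/orders": {
            "post": {
                "summary": "Create a new order",
                "responses": {"201": {"description": "Order created"}},
            },
            # Hypothetical operation with no summary and no responses.
            "get": {"responses": {}},
        }
    },
}

for problem in lint_openapi(spec):
    print(problem)
```

Running a check like this in CI keeps incomplete specs out of the catalog without anyone reviewing them by hand.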
Runbook Links
For every service, link to:
- How to deploy it
- How to restart it
- Common failure modes and remediation
- Escalation contacts
Catalog Architecture
Backstage (Spotify’s Open Source Platform)
Backstage, created at Spotify and now a CNCF project, is a widely adopted open source framework for developer portals, with a software catalog at its core:
# catalog-info.yaml (lives in the service's repo)
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: order-service
  description: Manages order lifecycle
  annotations:
    github.com/project-slug: org/order-service
    pagerduty.com/service-id: P1234AB
    grafana/dashboard-selector: order-service
spec:
  type: service
  lifecycle: production
  owner: backend-team
  providesApis:
    - order-api
  consumesApis:
    - payment-api
    - inventory-api
Once a repository is registered or picked up by a discovery provider, Backstage reads its catalog-info.yaml and assembles the global catalog automatically. The metadata lives next to the code, and nobody enters it by hand.
Custom Solutions
For smaller organizations, a static site or a database-backed internal tool is often enough. One possible repository layout:
catalog/
├── services/
│   ├── order-service.yaml
│   ├── payment-service.yaml
│   └── inventory-service.yaml
├── libraries/
│   ├── auth-sdk.yaml
│   └── logging-lib.yaml
└── infrastructure/
    ├── postgres-primary.yaml
    └── redis-shared.yaml
A CI job validates the YAML schema and publishes to an internal site on every merge.
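The validation step in such a CI job could look roughly like the sketch below. It assumes each YAML file has already been parsed into a dictionary (for example with PyYAML's `yaml.safe_load`). Which fields count as required is an assumption made for illustration, and the allowed values are taken from the comments in the service metadata example.

```python
from typing import List

# Assumed policy: these fields must be present and non-empty.
REQUIRED_FIELDS = ("name", "description", "owner", "tier", "lifecycle", "repository")
ALLOWED = {
    "tier": {"critical", "standard", "experimental"},
    "lifecycle": {"development", "staging", "production", "deprecated"},
}


def validate_entry(entry: dict) -> List[str]:
    """Return a list of schema errors for one catalog entry (empty if valid)."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not entry.get(field):
            errors.append(f"missing required field: {field}")
    for field, allowed in ALLOWED.items():
        value = entry.get(field)
        if value and value not in allowed:
            errors.append(f"{field}: {value!r} is not one of {sorted(allowed)}")
    return errors


# Hypothetical entry with two problems: no owner and an unknown tier.
entry = {
    "name": "order-service",
    "description": "Manages order lifecycle from creation to fulfillment",
    "tier": "important",
    "lifecycle": "production",
    "repository": "https://github.com/org/order-service",
}

for error in validate_entry(entry):
    print(error)
```

In CI, the job would exit with a non-zero status whenever any entry returns errors, so an invalid file blocks the merge instead of reaching the published site.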
Keeping the Catalog Fresh
The number one failure mode of service catalogs is staleness. The catalog launches with accurate data, then slowly diverges from reality as services are created, modified, and retired without updating the catalog.
Automated Validation
import requests

# load_catalog, report_stale, and the github, org, and grafana clients are
# placeholders for whatever integrations your organization provides.

def validate_catalog_freshness():
    """CI job that runs weekly and reports catalog entries that have drifted."""
    catalog = load_catalog()
    for service in catalog.services:
        # Check repository exists
        if not github.repo_exists(service.repository):
            report_stale(service.name, "repository not found")
        # Check API spec is accessible (network errors count as broken)
        if service.api_spec:
            try:
                response = requests.get(service.api_spec, timeout=10)
                spec_ok = response.status_code == 200
            except requests.RequestException:
                spec_ok = False
            if not spec_ok:
                report_stale(service.name, "API spec URL broken")
        # Check owner team exists in org chart
        if not org.team_exists(service.owner):
            report_stale(service.name, "owner team not found")
        # Check dashboard links
        for dashboard in service.dashboards:
            if not grafana.dashboard_exists(dashboard):
                report_stale(service.name, f"dashboard broken: {dashboard}")
Ownership Enforcement
- Every PR that creates a new service must include a catalog-info.yaml
- Quarterly ownership review: each team confirms their service list
- Unowned services get escalated to engineering leadership
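The quarterly review is easier to run if each team receives its own list. One way to produce it is sketched here, assuming catalog entries are available as dictionaries keyed by service name. The function name `services_by_owner`, the `(unowned)` bucket, and the sample services are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, List

UNOWNED = "(unowned)"


def services_by_owner(catalog: Dict[str, dict]) -> Dict[str, List[str]]:
    """Group service names by owning team; entries without an owner are
    collected under the UNOWNED key so they can be escalated."""
    groups = defaultdict(list)
    for name, entry in catalog.items():
        groups[entry.get("owner") or UNOWNED].append(name)
    return {owner: sorted(names) for owner, names in sorted(groups.items())}


# Hypothetical catalog: report-cron has lost its owner.
catalog = {
    "order-service": {"owner": "backend-team"},
    "inventory-service": {"owner": "backend-team"},
    "payment-service": {"owner": "payments-team"},
    "report-cron": {"owner": None},
}

for owner, services in services_by_owner(catalog).items():
    print(f"{owner}: {', '.join(services)}")
```

Each team confirms or corrects its own line, and anything under `(unowned)` goes to engineering leadership.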
Dependency Mapping
The catalog enables dependency visualization:
api-gateway
├── order-service
│   ├── payment-service
│   │   └── stripe-api (external)
│   ├── inventory-service
│   │   └── postgres-primary
│   └── redis-shared
├── user-service
│   └── postgres-users
└── notification-service
    ├── sendgrid-api (external)
    └── redis-shared
This map answers critical questions:
- “If postgres-primary goes down, which services are affected?” → inventory-service and order-service, and through them api-gateway
- “Which services depend on external APIs?” → payment-service (Stripe), notification-service (SendGrid)
- “What is the blast radius of deploying a new redis version?” → order-service and notification-service directly, and api-gateway through them
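Answering these questions by hand becomes error-prone as the graph grows, and indirect dependents are easy to miss. Here is a minimal sketch of a blast-radius query, assuming the direct dependencies are available as a dictionary. The `DEPENDS_ON` mapping is transcribed from the tree above, and the function name `blast_radius` is illustrative.

```python
from collections import deque
from typing import Dict, List, Set

# Direct dependencies, transcribed from the dependency tree.
DEPENDS_ON: Dict[str, List[str]] = {
    "api-gateway": ["order-service", "user-service", "notification-service"],
    "order-service": ["payment-service", "inventory-service", "redis-shared"],
    "payment-service": ["stripe-api"],
    "inventory-service": ["postgres-primary"],
    "user-service": ["postgres-users"],
    "notification-service": ["sendgrid-api", "redis-shared"],
}


def blast_radius(target: str, depends_on: Dict[str, List[str]]) -> Set[str]:
    """Every service that depends on `target`, directly or transitively."""
    # Invert the edges: for each component, who depends on it directly?
    dependents: Dict[str, Set[str]] = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(service)

    # Breadth-first walk up the inverted graph, visiting each service once.
    affected: Set[str] = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for service in dependents.get(current, ()):
            if service not in affected:
                affected.add(service)
                queue.append(service)
    return affected


print(sorted(blast_radius("postgres-primary", DEPENDS_ON)))
# ['api-gateway', 'inventory-service', 'order-service']
print(sorted(blast_radius("redis-shared", DEPENDS_ON)))
# ['api-gateway', 'notification-service', 'order-service']
```

The traversal follows dependents transitively, so api-gateway shows up in both results even though it never talks to the database or the cache itself.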
Anti-Patterns
| Anti-Pattern | Consequence | Fix |
|---|---|---|
| Manual catalog maintenance | Data goes stale within months | Automate from source repos |
| No API specifications | Developers guess at interfaces | Require OpenAPI spec in catalog entry |
| No ownership tracking | Services become orphans | Quarterly ownership review |
| Catalog without search | Nobody can find anything | Full-text search + category filtering |
| Separate from development workflow | Extra step nobody does | catalog-info.yaml in the repo, validated in CI |
A service catalog is not a documentation project; it is a platform feature. It cuts the time between “I need to integrate with X” and “I know exactly how to call X and who to ask for help” from days to minutes.