
Internal Developer Platforms: Building the Self-Service Layer Your Engineers Actually Want

Design and build an Internal Developer Platform (IDP) that eliminates toil. Covers golden paths, self-service infrastructure, developer portals, and the organizational patterns that make platforms succeed.

Platform engineering exists because DevOps made a promise it could not keep: that every developer would become an infrastructure expert. They will not. They should not have to. Your frontend engineer should not need to understand Kubernetes networking to deploy a React application, and your data scientist should not need to write Terraform to spin up a GPU instance.

An Internal Developer Platform (IDP) is the self-service layer that abstracts infrastructure complexity behind simple, opinionated interfaces. Done right, it gives developers autonomy without chaos. Done wrong, it becomes another layer of bureaucracy that nobody uses, built by a team that nobody asked for.

This guide covers how to build one that people actually use.


The Platform Maturity Model

Most organizations think they need a platform. Most organizations are wrong. Here is how to tell where yours stands:

Level 0: Wild West
  Every team provisions their own infrastructure.
  10 teams = 10 different ways to deploy.
  "Works on my machine" is a deployment strategy.

Level 1: Shared Scripts
  A platform team maintains shared CI/CD pipelines.
  Teams can deploy, but configuration is tribal knowledge.
  The "platform" is a Confluence page nobody reads.

Level 2: Golden Paths
  Opinionated templates for common workloads.
  Self-service for 80% of use cases.
  Escape hatches for the other 20%.

Level 3: Full IDP
  Developer portal with service catalog.
  Self-service databases, queues, storage.
  Infrastructure as product with SLOs.

Level 4: Platform as Competitive Advantage
  Platform enables capabilities competitors cannot match.
  New services launch in hours, not weeks.
  Developer satisfaction is a tracked metric.

Level  Team Size  Infra Engineers           Developer Wait Time
0      Any        0 (everyone does it)      Variable (minutes to weeks)
1      20-50      1-3                       Days
2      50-200     3-8                       Hours
3      200-1000   8-20                      Minutes
4      1000+      20+ (but ratio improves)  Self-service

The most important insight: aim for Level 2 first. Do not try to build a Level 3 platform in a Level 0 organization. The golden path approach delivers 80% of the value with 20% of the effort.


Golden Paths: The Heart of Platform Engineering

A golden path is an opinionated, pre-configured way to accomplish a common task. It is not a mandate — developers can go off-path — but the golden path should be so good that going off-path feels like unnecessary work.

Example Golden Path: Deploy a New Microservice

# service-template.yaml — everything a developer needs to provide
apiVersion: platform.garnet.io/v1
kind: ServiceTemplate
metadata:
  name: my-new-service
spec:
  # Developer provides these 4 fields. Platform handles everything else.
  name: payment-validator
  team: payments
  language: python
  tier: critical  # critical | standard | experimental

  # Everything below is set by the template defaults:
  # ✅ Kubernetes deployment + service + ingress
  # ✅ CI/CD pipeline (build, test, scan, deploy)
  # ✅ Monitoring dashboards (RED metrics)
  # ✅ Alerting rules (error rate, latency, saturation)
  # ✅ Log aggregation pipeline
  # ✅ mTLS certificates
  # ✅ Resource limits and autoscaling
  # ✅ Network policies
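
To make the "four fields in, defaults out" idea concrete, here is a minimal Python sketch of how a platform might validate the developer's input and expand it with tier defaults. The `ServiceSpec` class, the `expand_spec` function, and the values in `TIER_DEFAULTS` are illustrative assumptions, not part of any real platform API.

```python
from dataclasses import dataclass

REQUIRED_FIELDS = ("name", "team", "language", "tier")
ALLOWED_TIERS = {"critical", "standard", "experimental"}

# Hypothetical per-tier defaults; a real platform team would choose its own.
TIER_DEFAULTS = {
    "critical": {"min_replicas": 3, "page_on_call": True},
    "standard": {"min_replicas": 2, "page_on_call": True},
    "experimental": {"min_replicas": 1, "page_on_call": False},
}


@dataclass(frozen=True)
class ServiceSpec:
    name: str
    team: str
    language: str
    tier: str
    min_replicas: int
    page_on_call: bool


def expand_spec(spec: dict) -> ServiceSpec:
    """Validate the four developer-provided fields and fill in tier defaults."""
    missing = [field for field in REQUIRED_FIELDS if not spec.get(field)]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    if spec["tier"] not in ALLOWED_TIERS:
        raise ValueError(f"unknown tier: {spec['tier']!r}")
    return ServiceSpec(
        name=spec["name"],
        team=spec["team"],
        language=spec["language"],
        tier=spec["tier"],
        **TIER_DEFAULTS[spec["tier"]],
    )
```

Rejecting bad input at submission time keeps the failure close to the developer, rather than surfacing it halfway through provisioning.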

What happens when a developer submits this template:

Developer runs: platform create service --from service-template.yaml

Platform does:
  1. Creates GitHub repository from language-specific scaffold
  2. Provisions Kubernetes namespace with resource quotas
  3. Generates CI/CD pipeline (GitHub Actions / GitLab CI)
  4. Creates Datadog/Grafana dashboards
  5. Configures PagerDuty escalation based on tier
  6. Registers service in service catalog (Backstage)
  7. Generates initial README with runbook template
  8. Outputs: "Your service is running at https://payment-validator.internal"

Total time: < 5 minutes
Previous time: 2-3 days of tickets and Slack messages
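
The sequence above can be sketched as an ordered list of steps that stops at the first failure. This is a simplified Python illustration: the step functions are stand-ins that only return a description, and every name here (`provision`, `STEPS`, the individual step functions) is hypothetical. Real steps would call the GitHub, Kubernetes, or PagerDuty APIs.

```python
from typing import Callable

# Each step takes the service spec and returns a short status message.
Step = Callable[[dict], str]


def create_repo(svc: dict) -> str:
    return f"repository scaffolded for {svc['language']}"


def create_namespace(svc: dict) -> str:
    return f"namespace for {svc['name']} with resource quotas"


def configure_paging(svc: dict) -> str:
    return f"escalation policy for tier {svc['tier']}"


def register_in_catalog(svc: dict) -> str:
    return f"{svc['name']} registered, owner {svc['team']}"


STEPS: list[tuple[str, Step]] = [
    ("repository", create_repo),
    ("namespace", create_namespace),
    ("paging", configure_paging),
    ("catalog", register_in_catalog),
]


def provision(svc: dict, steps: list[tuple[str, Step]] = STEPS) -> list[str]:
    """Run steps in order and return a log; stop at the first failure."""
    log = []
    for name, step in steps:
        try:
            log.append(f"{name}: {step(svc)}")
        except Exception as exc:
            # Later steps depend on earlier ones, so do not continue.
            log.append(f"{name}: FAILED ({exc})")
            break
    return log
```

Returning a log instead of printing makes the same function usable from a CLI, a portal backend, or a test.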

What Makes a Golden Path Successful

Property        Good Golden Path                      Bad Golden Path
Opinionated     Makes decisions for you               Exposes 100 config options
Escape hatches  Allows overrides when needed          Locks you in completely
Documentation   Self-documenting via templates        Requires reading a wiki
Maintenance     Updated by platform team              Abandoned after initial launch
Feedback loop   Users report friction, team iterates  “We built it, they will come”

Developer Portal: The Service Catalog

A developer portal (typically built with Backstage or similar) is the single pane of glass for your entire engineering organization.

What Belongs in the Portal

┌──────────────────────────────────────────────────────────┐
│                   DEVELOPER PORTAL                        │
│                                                           │
│  ┌─────────────┐  ┌─────────────┐  ┌────────────────┐   │
│  │ Service     │  │ API         │  │ Documentation  │   │
│  │ Catalog     │  │ Registry    │  │ Hub            │   │
│  │             │  │             │  │                │   │
│  │ - Owner     │  │ - Endpoints │  │ - Runbooks     │   │
│  │ - SLOs      │  │ - Schemas   │  │ - ADRs         │   │
│  │ - Deps      │  │ - Versions  │  │ - Tutorials    │   │
│  │ - Health    │  │ - Auth      │  │ - On-call      │   │
│  └─────────────┘  └─────────────┘  └────────────────┘   │
│                                                           │
│  ┌─────────────┐  ┌─────────────┐  ┌────────────────┐   │
│  │ Templates   │  │ Scorecards  │  │ Cost Dashboard │   │
│  │ (Golden     │  │ (Production │  │ (Per-team      │   │
│  │  Paths)     │  │  Readiness) │  │  cloud spend)  │   │
│  └─────────────┘  └─────────────┘  └────────────────┘   │
│                                                           │
└──────────────────────────────────────────────────────────┘

Backstage Catalog Descriptor

# catalog-info.yaml — lives in every service repo
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payment-validator
  description: Validates payment transactions against fraud rules
  annotations:
    github.com/project-slug: garnet/payment-validator
    pagerduty.com/service-id: P123ABC
    grafana/dashboard-selector: "app=payment-validator"
  tags:
    - python
    - payments
    - critical
  links:
    - url: https://grafana.internal/d/payment-validator
      title: Grafana Dashboard
    - url: https://runbooks.internal/payment-validator
      title: Runbook
spec:
  type: service
  lifecycle: production
  owner: team-payments
  system: payment-processing
  dependsOn:
    - component:fraud-engine
    - resource:postgres-payments
  providesApis:
    - payment-validation-api
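
A catalog is only useful if descriptors are complete, so it can help to check them in CI before they reach the portal. The Python sketch below assumes the descriptor has already been parsed into a dict by a YAML parser. The function name `check_component` is hypothetical, and the set of required fields reflects my understanding of the Backstage Component kind (`spec.type`, `spec.lifecycle`, `spec.owner`); verify it against the Backstage documentation for your version.

```python
REQUIRED_SPEC_FIELDS = ("type", "lifecycle", "owner")


def check_component(descriptor: dict) -> list[str]:
    """Return a list of problems found in a parsed catalog-info descriptor."""
    problems = []
    if descriptor.get("apiVersion") != "backstage.io/v1alpha1":
        problems.append("apiVersion should be backstage.io/v1alpha1")
    if descriptor.get("kind") != "Component":
        problems.append("kind should be Component")
    if not descriptor.get("metadata", {}).get("name"):
        problems.append("metadata.name is required")
    spec = descriptor.get("spec", {})
    for field in REQUIRED_SPEC_FIELDS:
        if not spec.get(field):
            problems.append(f"spec.{field} is required")
    return problems
```

An empty list means the descriptor passed these checks. A CI job could fail the build and print the list otherwise.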

Self-Service Infrastructure

The most impactful self-service capabilities, in order of developer demand:

Tier 1: Build These First

Capability            Developer Experience                Platform Implementation
Deploy a service      platform deploy                     K8s + Helm + ArgoCD
Create a database     platform db create --type postgres  Terraform + Cloud SQL/RDS
Add a secret          platform secret set KEY=value       Vault / AWS Secrets Manager
View logs             Portal link → Grafana/Loki          Centralized log pipeline
Check service health  Portal → dashboard                  Prometheus + Grafana
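
As a rough illustration of the developer-facing side, the commands in this table could be wired up with Python's standard `argparse` module. The `platform` CLI itself is hypothetical, and this sketch covers only argument parsing; the Terraform, Vault, or ArgoCD calls behind each command are not shown.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a hypothetical `platform` CLI."""
    parser = argparse.ArgumentParser(prog="platform")
    sub = parser.add_subparsers(dest="command", required=True)

    # platform deploy
    sub.add_parser("deploy", help="deploy the current service")

    # platform db create --type postgres
    db = sub.add_parser("db", help="manage databases")
    db_sub = db.add_subparsers(dest="action", required=True)
    db_create = db_sub.add_parser("create")
    # Only postgres appears in the table; other engines would be added here.
    db_create.add_argument("--type", choices=["postgres"], required=True)

    # platform secret set KEY=value
    secret = sub.add_parser("secret", help="manage secrets")
    secret_sub = secret.add_subparsers(dest="action", required=True)
    secret_set = secret_sub.add_parser("set")
    secret_set.add_argument("pair", metavar="KEY=value")
    return parser


def parse_secret(pair: str) -> tuple[str, str]:
    """Split KEY=value on the first '=' so values may contain '='."""
    key, sep, value = pair.partition("=")
    if not sep or not key:
        raise ValueError("expected KEY=value")
    return key, value
```

Restricting `--type` to a fixed set of choices is one way to keep the interface opinionated: unsupported options fail at the prompt instead of in a ticket queue.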

Tier 2: Build These Next

Capability                    Developer Experience                         Platform Implementation
Create a message queue        platform queue create --type kafka           Terraform + managed Kafka
Spin up preview environments  Automatic on PR                              Namespace-per-PR + teardown
Run load tests                platform loadtest --rps 1000                 k6/Locust + shared infra
Create a cron job             platform cron create --schedule "0 * * * *"  K8s CronJob template

Measuring Platform Success

A platform that nobody measures is a platform that nobody improves.

Key Metrics

Metric                  What It Measures                                               Target
Time to first deploy    How long from “I need a new service” to running in production  < 1 hour
Deployment frequency    How often teams deploy per day                                 > 1/day
Golden path adoption    % of services using platform templates                         > 80%
Developer satisfaction  Quarterly survey (NPS for platform)                            > 40 NPS
Ticket volume           Requests to platform team per week                             Decreasing trend
Self-service ratio      % of infra requests handled without platform team              > 70%
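
Several of these metrics are simple ratios, so they are cheap to compute once the raw counts exist. The Python sketch below uses the standard NPS definition (percentage of respondents scoring 9-10 minus percentage scoring 0-6, on a 0-10 scale). The function names and the sample numbers in the usage note are illustrative only.

```python
def nps(scores: list[int]) -> float:
    """Net Promoter Score from 0-10 survey answers, in the range -100 to 100."""
    if not scores:
        raise ValueError("no survey responses")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)


def ratio_pct(part: int, total: int) -> float:
    """Percentage helper for adoption and self-service ratios."""
    return 100.0 * part / total if total else 0.0
```

For example, with hypothetical counts, `ratio_pct(41, 50)` gives a golden path adoption of 82.0 for 41 templated services out of 50, which clears the 80% target.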

The Anti-Metrics (What to Watch For)

Anti-Pattern         Signal                                            Root Cause
Shadow IT            Teams building their own deployment pipelines     Golden paths do not cover their use case
Platform avoidance   Teams requesting exceptions to skip the platform  Platform adds friction instead of removing it
Ticket queue growth  Platform team becomes a bottleneck                Not enough self-service automation
Feature creep        Platform supports every edge case                 No discipline about what is in scope

Implementation Checklist

  • Survey developers: What are the top 5 things that waste your time? (Build for those first)
  • Define 3-5 golden paths for your most common workloads
  • Build a service template that deploys a working service in < 5 minutes
  • Set up a service catalog (Backstage or equivalent) and register all existing services
  • Implement self-service for databases and secrets (highest demand)
  • Establish platform SLOs: “95% of self-service requests complete in < 10 minutes”
  • Run quarterly developer satisfaction surveys and publish results
  • Resist the urge to build for edge cases — 80/20 rule applies
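
The platform SLO in the checklist ("95% of self-service requests complete in < 10 minutes") can be checked with a few lines of Python, assuming request durations are already collected in minutes. The function names and default thresholds below are a sketch based on that one example SLO, not a prescribed implementation.

```python
def slo_compliance(durations_min: list[float], threshold_min: float = 10.0) -> float:
    """Fraction of requests that completed in under threshold_min minutes."""
    if not durations_min:
        raise ValueError("no requests recorded")
    within = sum(1 for d in durations_min if d < threshold_min)
    return within / len(durations_min)


def meets_slo(
    durations_min: list[float],
    target: float = 0.95,
    threshold_min: float = 10.0,
) -> bool:
    """True when at least `target` of requests finished within the threshold."""
    return slo_compliance(durations_min, threshold_min) >= target
```

Publishing the compliance figure alongside the quarterly survey results gives developers one objective and one subjective view of the same platform.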
Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
