Internal Developer Platform Design Patterns
How to design and build an internal developer platform. Covers golden paths, self-service infrastructure, developer experience metrics, and platform team operating models.
Platform engineering is the discipline of building and maintaining the internal developer platform (IDP) that enables self-service infrastructure, automated compliance, and standardized deployment paths. Done right, it eliminates the ticket-driven bottleneck between development and operations. Done wrong, it creates a new bottleneck with extra steps.
The fundamental insight: developers don’t want infrastructure autonomy; they want infrastructure that works without their having to think about it. The best platform is invisible. It handles provisioning, security, compliance, and deployment while developers focus on business logic.
The Golden Path
A golden path is an opinionated, optimized route through the platform that handles 80% of use cases without any configuration. Teams that deviate from the golden path accept full operational responsibility for their choices.
Example Golden Path: New Microservice
1. Developer runs `platform create service --name order-api --lang go`
2. Platform generates:
   ├── Repository with CI/CD pipeline
   ├── Docker + Kubernetes manifests
   ├── Development, staging, production environments
   ├── Monitoring dashboards + alerting rules
   ├── Database (PostgreSQL) with backup configuration
   ├── Secret management (Vault integration)
   ├── Service mesh enrollment
   └── Developer documentation skeleton
3. First deployment happens in < 15 minutes
4. Full production readiness in < 1 hour
The golden path reduces “time to first deployment” from 2-3 weeks to under an hour. This isn’t just an efficiency gain — it eliminates an entire class of configuration errors that plague manual setups.
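To make the scaffold step concrete, here is a minimal sketch of how a `platform create service` command might plan its output. The function name `plan_service`, the artifact identifiers, and the single-language restriction are illustrative assumptions, not a description of any real CLI.

```python
from dataclasses import dataclass, field

# Hypothetical identifiers mirroring the golden-path list above.
GOLDEN_PATH_ARTIFACTS = [
    "repository-with-ci-cd",
    "docker-and-kubernetes-manifests",
    "environments-dev-staging-production",
    "monitoring-dashboards-and-alerts",
    "postgresql-with-backups",
    "vault-secret-management",
    "service-mesh-enrollment",
    "documentation-skeleton",
]

# Assumption: the platform starts with one supported stack.
SUPPORTED_LANGS = {"go"}


@dataclass
class ServicePlan:
    name: str
    lang: str
    artifacts: list = field(default_factory=list)


def plan_service(name: str, lang: str) -> ServicePlan:
    """Return the artifacts the golden path would generate for a new service."""
    if lang not in SUPPORTED_LANGS:
        # Off-path stacks are possible, but the owning team operates them.
        raise ValueError(f"no golden path for language {lang!r}")
    if not name or not name.replace("-", "").isalnum():
        raise ValueError(f"invalid service name: {name!r}")
    return ServicePlan(name=name, lang=lang, artifacts=list(GOLDEN_PATH_ARTIFACTS))


plan = plan_service("order-api", "go")
print(f"{plan.name}: {len(plan.artifacts)} artifacts planned")
```

Rejecting unsupported languages up front keeps the boundary between the golden path and self-operated services explicit.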
Self-Service Infrastructure Components
1. Service Catalog
A curated menu of infrastructure components that developers can provision on-demand:
| Component | Self-Service? | Approval Required? |
|---|---|---|
| PostgreSQL database | ✅ | No |
| Redis cache | ✅ | No |
| Kafka topic | ✅ | No |
| Public API endpoint | ✅ | Security review |
| VPN connection | ❌ | Network team |
| Cross-account IAM | ❌ | Security team |
The rule: if a component can be provisioned safely with guardrails, make it self-service. If it can create security or cost risks, require lightweight approval.
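The routing rule can be sketched as a small lookup over the catalog. The entries below mirror the table; the keys, the `CatalogEntry` type, and the `route_request` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CatalogEntry:
    self_service: bool
    approver: Optional[str]  # None means no approval is required


# Entries mirror the table above; the keys are hypothetical identifiers.
CATALOG = {
    "postgresql-database": CatalogEntry(True, None),
    "redis-cache": CatalogEntry(True, None),
    "kafka-topic": CatalogEntry(True, None),
    "public-api-endpoint": CatalogEntry(True, "security review"),
    "vpn-connection": CatalogEntry(False, "network team"),
    "cross-account-iam": CatalogEntry(False, "security team"),
}


def route_request(component: str) -> str:
    """Decide how a provisioning request for a component is handled."""
    entry = CATALOG.get(component)
    if entry is None:
        return "unknown component: not in the catalog"
    if not entry.self_service:
        return f"ticket: handled by {entry.approver}"
    if entry.approver is not None:
        return f"self-service after approval: {entry.approver}"
    return "self-service: provision immediately"


for name in ("redis-cache", "public-api-endpoint", "vpn-connection"):
    print(f"{name} -> {route_request(name)}")
```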
2. Environment Management
Developers should provision and tear down environments without tickets:
# platform.yaml (in repo root)
environments:
  dev:
    replicas: 1
    resources: small
    database: shared-dev
    auto-destroy: 7d
  staging:
    replicas: 2
    resources: medium
    database: dedicated
    auto-destroy: 30d
  production:
    replicas: 3
    resources: large
    database: dedicated-ha
    auto-destroy: never
Auto-destroy policies prevent environment sprawl. Development environments that have gone unused for 7 days are cleaned up automatically, which can cut non-production costs by 30-50%.
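A cleanup job could evaluate the `auto-destroy` values along these lines. This is a sketch under two assumptions: the value is either `never` or a whole number of days such as `7d`, and the clock starts at the environment's last use. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_ttl(value: str) -> Optional[timedelta]:
    """Parse an auto-destroy value such as '7d' or '30d'; 'never' means no expiry."""
    if value == "never":
        return None
    if value.endswith("d") and value[:-1].isdigit():
        return timedelta(days=int(value[:-1]))
    raise ValueError(f"unsupported auto-destroy value: {value!r}")


def should_destroy(ttl: str, last_used: datetime, now: datetime) -> bool:
    """True when the environment has been idle for at least its TTL."""
    limit = parse_ttl(ttl)
    return limit is not None and now - last_used >= limit


now = datetime(2024, 1, 10, tzinfo=timezone.utc)
last_used = datetime(2024, 1, 1, tzinfo=timezone.utc)  # idle for 9 days

for env, ttl in (("dev", "7d"), ("staging", "30d"), ("production", "never")):
    print(env, should_destroy(ttl, last_used, now))
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the policy easy to test.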
3. Observability as a Platform Feature
Don’t ask developers to configure monitoring. Bake it into the platform:
- Automatic instrumentation: OpenTelemetry agent injected at deployment
- Standard dashboards: Generated from service metadata
- Baseline alerts: CPU, memory, error rate, latency P99 — configured by default
- Custom metrics: Self-service via annotations in code
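Baseline alerts can be generated from nothing more than the service name. The sketch below assumes a simple rule format and made-up default thresholds; real values would come from your own SLOs, and `baseline_alerts` is a hypothetical helper.

```python
# Assumed defaults for illustration only; tune these to your own SLOs.
DEFAULT_THRESHOLDS = {
    "cpu_utilization": 0.80,
    "memory_utilization": 0.85,
    "error_rate": 0.01,
    "latency_p99_seconds": 0.5,
}


def baseline_alerts(service, overrides=None):
    """Build one alert rule per baseline signal, applying per-service overrides."""
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULT_THRESHOLDS)
    if unknown:
        raise ValueError(f"unknown signals: {sorted(unknown)}")
    thresholds = {**DEFAULT_THRESHOLDS, **overrides}
    return [
        {"service": service, "signal": signal, "threshold": threshold}
        for signal, threshold in thresholds.items()
    ]


rules = baseline_alerts("order-api", {"latency_p99_seconds": 0.25})
for rule in rules:
    print(rule)
```

Developers get all four alerts without writing any configuration, and override only the thresholds that do not fit their service.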
Developer Experience Metrics
Platform success is measured by developer productivity, not infrastructure metrics.
| Metric | Target | How to Measure |
|---|---|---|
| Time to first deploy | < 1 hour | From git init to production traffic |
| Deploy frequency | Multiple per day per team | CI/CD pipeline observations |
| Lead time | < 1 day | Commit to production |
| Change failure rate | < 5% | Deploys needing a rollback or hotfix / total deploys |
| Developer satisfaction | > 4/5 | Quarterly survey (5-point scale) |
| Self-service rate | > 90% | Requests fulfilled without tickets |
If the self-service rate falls below 80%, well short of the 90% target, your platform has serious gaps. Every ticket represents a missing capability or a UX problem in your platform.
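Two of these metrics are simple ratios, and computing them explicitly removes ambiguity about the denominators. The sketch below uses the targets from the table and the 80% warning level; the function names and sample counts are illustrative assumptions.

```python
def self_service_rate(self_served: int, tickets: int) -> float:
    """Share of requests fulfilled without a ticket."""
    total = self_served + tickets
    if total == 0:
        raise ValueError("no requests recorded")
    return self_served / total


def change_failure_rate(failed_deploys: int, total_deploys: int) -> float:
    """Share of deploys that failed (for example, those that were rolled back)."""
    if total_deploys == 0:
        raise ValueError("no deploys recorded")
    return failed_deploys / total_deploys


def platform_health(self_served, tickets, failed_deploys, total_deploys):
    """Compare the measured rates with the targets from the table."""
    ssr = self_service_rate(self_served, tickets)
    cfr = change_failure_rate(failed_deploys, total_deploys)
    return {
        "self_service_on_target": ssr > 0.90,
        "self_service_has_gaps": ssr < 0.80,
        "change_failure_on_target": cfr < 0.05,
    }


# Made-up sample counts for one quarter.
print(platform_health(self_served=460, tickets=40, failed_deploys=3, total_deploys=100))
```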
Platform Team Operating Model
Team Size
Rule of thumb: 1 platform engineer per 10-15 application developers. A 100-person engineering org needs a 7-10 person platform team.
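The arithmetic behind the rule of thumb, as a small sketch. It assumes the headcount passed in is the number of application developers and rounds up to whole engineers; `platform_team_range` is a hypothetical helper.

```python
def platform_team_range(app_developers: int) -> tuple:
    """Return (low, high) platform team size at 1 engineer per 15 and per 10 developers."""
    if app_developers <= 0:
        raise ValueError("app_developers must be positive")
    low = -(-app_developers // 15)   # ceiling division: 1 per 15 developers
    high = -(-app_developers // 10)  # ceiling division: 1 per 10 developers
    return low, high


print(platform_team_range(100))  # (7, 10)
```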
Treat the Platform as a Product
- Product manager: Prioritizes features based on developer needs
- User research: Regular feedback sessions with development teams
- SLAs: Platform availability and capability commitments
- Documentation: Comprehensive, up-to-date, with working examples
The Anti-Pattern: Platform as Gatekeeper
The platform team’s job is to enable, not control. If developers see the platform team as a bottleneck, you’ve failed. Signs of trouble:
- Developers circumvent the platform to get things done
- Most platform interactions are through tickets, not self-service
- Platform team is in the critical path for every deployment
Implementation Roadmap
- Months 1-2: Build the golden path for one language/framework
- Months 3-4: Add self-service database and cache provisioning
- Months 5-6: Implement automatic monitoring and alerting
- Months 7-9: Build the service catalog UI and developer portal
- Months 10-12: Add compliance automation and cost visibility
- Ongoing: Measure developer satisfaction and iterate
Start narrow. A golden path that works perfectly for one stack is infinitely more valuable than a platform that half-works for five stacks.