
Self-Service Infrastructure: Empowering Developers Without Losing Control

Build self-service infrastructure provisioning that gives developers speed while maintaining security, compliance, and cost controls. Covers Terraform modules, Crossplane, guardrails, approval workflows, and the balance between autonomy and governance.

The platform team’s job is to get out of the developer’s way — without getting out of the way of security, compliance, and cost control. Self-service infrastructure achieves this by providing pre-approved, guardrailed pathways for common provisioning tasks. Developers get what they need in minutes instead of days. Platform teams maintain control without becoming a ticket-processing bottleneck.


The Self-Service Spectrum

Not everything should be self-service. The spectrum looks like this:

Full Self-Service (no approval needed):
  - Dev/staging environments
  - Feature branch databases
  - S3 buckets for development
  - CI/CD pipeline modifications

Guardrailed Self-Service (automated guardrails, no human approval):
  - Production microservice deployment
  - Database creation within size limits
  - DNS record creation
  - Load balancer configuration

Assisted Self-Service (requires approval):
  - Production database schema changes
  - Cross-account network peering
  - New AWS account creation
  - Resources exceeding cost thresholds

Manual Request (platform team executes):
  - Multi-region failover setup
  - Security group modifications
  - Compliance-sensitive infrastructure
  - Vendor integrations

The goal over time is to move items up this spectrum — converting manual requests into assisted, assisted into guardrailed, guardrailed into full self-service.
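One way to make the spectrum concrete is a lookup from request type to handling tier. The sketch below is only an illustration: the request-type keys are hypothetical names that mirror the lists above, and sending unknown request types to the manual tier is an assumed (conservative) default.

```python
from enum import Enum


class Tier(Enum):
    FULL_SELF_SERVICE = "full self-service"
    GUARDRAILED = "guardrailed self-service"
    ASSISTED = "assisted self-service"
    MANUAL = "manual request"


# Hypothetical request-type keys, mirroring the lists above.
TIER_BY_REQUEST = {
    "dev_environment": Tier.FULL_SELF_SERVICE,
    "feature_branch_database": Tier.FULL_SELF_SERVICE,
    "production_service_deployment": Tier.GUARDRAILED,
    "dns_record": Tier.GUARDRAILED,
    "production_schema_change": Tier.ASSISTED,
    "new_aws_account": Tier.ASSISTED,
    "multi_region_failover": Tier.MANUAL,
    "vendor_integration": Tier.MANUAL,
}


def route_request(request_type: str) -> Tier:
    """Return the handling tier; unknown request types fall back to manual."""
    return TIER_BY_REQUEST.get(request_type, Tier.MANUAL)


def needs_human(tier: Tier) -> bool:
    """Only the assisted and manual tiers involve a person."""
    return tier in (Tier.ASSISTED, Tier.MANUAL)
```

Moving an item up the spectrum then amounts to changing one entry in the table once the guardrails for it exist.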


Implementation Patterns

Terraform Module Catalog

Publish curated, hardened Terraform modules that encode best practices:

# Developer writes this
module "api_service" {
  source  = "registry.internal/platform/api-service/aws"
  version = "3.2.0"

  name        = "checkout-api"
  team        = "commerce"
  environment = "production"
  
  # The module handles:
  # - ECS Fargate service with proper IAM roles
  # - ALB with TLS termination
  # - CloudWatch log group with retention policy
  # - Auto-scaling configuration
  # - Security groups with least-privilege
  # - Tags for cost allocation
  # - Monitoring and alerting
}

The developer specifies what they want. The module specifies how it is built — including all the security, monitoring, and compliance details they would otherwise forget.

Crossplane (Kubernetes-Native)

Crossplane lets developers provision infrastructure using Kubernetes resources:

apiVersion: database.platform.example.com/v1alpha1
kind: PostgresDatabase
metadata:
  name: checkout-db
  namespace: commerce
spec:
  size: small        # Predefined sizes: small, medium, large
  version: "15"
  backups: daily
  environment: staging

The platform team defines a CompositeResourceDefinition (XRD) that declares this simple schema, plus Compositions that map each spec to the underlying cloud resources with all required configurations.
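In Crossplane this mapping is written declaratively, not in a general-purpose language. The Python sketch below only illustrates the idea of expanding a small developer-facing spec into the full set of enforced settings. The size presets (instance classes, storage figures) and the output field names are placeholder assumptions, not values from any real platform.

```python
from dataclasses import dataclass

# Hypothetical size presets; a platform team would choose its own values.
SIZE_PRESETS = {
    "small": {"instance_class": "db.t3.small", "storage_gb": 20},
    "medium": {"instance_class": "db.m5.large", "storage_gb": 100},
    "large": {"instance_class": "db.m5.2xlarge", "storage_gb": 500},
}


@dataclass(frozen=True)
class PostgresClaim:
    name: str
    namespace: str
    size: str
    version: str
    backups: str
    environment: str


def expand_claim(claim: PostgresClaim) -> dict:
    """Expand the developer-facing spec into the settings the platform enforces."""
    if claim.size not in SIZE_PRESETS:
        raise ValueError(
            f"unknown size {claim.size!r}; expected one of {sorted(SIZE_PRESETS)}"
        )
    preset = SIZE_PRESETS[claim.size]
    return {
        "identifier": f"{claim.namespace}-{claim.name}",
        "engine": "postgres",
        "engine_version": claim.version,
        "instance_class": preset["instance_class"],
        "allocated_storage_gb": preset["storage_gb"],
        # Settings the developer never specifies but always gets.
        "storage_encrypted": True,
        "publicly_accessible": False,
        "backup_schedule": claim.backups,
        "tags": {"team": claim.namespace, "environment": claim.environment},
    }
```

Rejecting unknown sizes up front is the point of the predefined list: developers choose from a short menu, and everything else is decided by the platform.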

Internal Developer Portal

A web UI that wraps the Terraform/Crossplane backend:

┌────────────────────────────────────────┐
│  Create New Service                    │
│                                        │
│  Service Name: [checkout-api    ]      │
│  Team:         [Commerce    ▼]         │
│  Environment:  [●Staging ○Production]  │
│  Database:     [☑ PostgreSQL]          │
│  Cache:        [☑ Redis]               │
│  Queue:        [☐ SQS]                 │
│                                        │
│  Estimated Cost: $142/month            │
│                                        │
│  [Create Service]                      │
└────────────────────────────────────────┘
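Behind a form like this, the portal mostly translates field values into the same resources a developer could have written by hand. As a rough sketch, the function below turns form values into the PostgresDatabase resource shown in the Crossplane section. The form keys (service_name, team, environment, database), the naming rule, and the default size and version are all assumptions made for illustration.

```python
from typing import Optional


def postgres_claim_from_form(form: dict) -> Optional[dict]:
    """Build a PostgresDatabase claim from portal form values.

    Returns None when the Database box is unchecked.
    """
    if not form.get("database"):
        return None
    return {
        "apiVersion": "database.platform.example.com/v1alpha1",
        "kind": "PostgresDatabase",
        "metadata": {
            # Hypothetical naming rule: <service>-db in the team's namespace.
            "name": f"{form['service_name']}-db",
            "namespace": form["team"].lower(),
        },
        "spec": {
            "size": form.get("size", "small"),     # assumed default
            "version": form.get("version", "15"),  # assumed default
            "backups": "daily",
            "environment": form["environment"].lower(),
        },
    }
```

Because the output is an ordinary manifest, the portal adds no new provisioning path: the same guardrails apply whether the request comes from the UI or from a committed file.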

Guardrails

Self-service without guardrails is a cost overrun and security incident waiting to happen.

Cost Guardrails

# Policy: resources over $500/month need manager approval;
# resources over $5,000/month are denied here and need VP approval.
# calculate_cost, allow, deny, and require_approval are assumed to be
# provided by the platform's policy framework.
def check_cost_policy(resource):
    estimated_monthly = calculate_cost(resource)

    if estimated_monthly > 5000:
        return deny("Resources over $5000/month require VP approval")
    elif estimated_monthly > 500:
        return require_approval("manager", f"Estimated cost: ${estimated_monthly:,.2f}/month")
    else:
        return allow()

Security Guardrails

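# OPA policies in Rego v1 syntax (the default in OPA 1.x).
# The package name is a placeholder.
package platform.guardrails

import rego.v1

# Databases must have encryption at rest
deny contains msg if {
    input.resource_type == "aws_db_instance"
    not input.storage_encrypted
    msg := "Database must have encryption at rest enabled"
}

# S3 buckets must not use a public canned ACL
deny contains msg if {
    input.resource_type == "aws_s3_bucket"
    input.acl in {"public-read", "public-read-write"}
    msg := "S3 buckets must not be publicly accessible"
}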
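The policies above read flat fields such as input.resource_type, while Terraform's JSON plan output (terraform show -json) nests each resource's planned values under resource_changes[].change.after. A small adapter can bridge the two. The sketch below assumes that flat input shape is what the policy engine should receive, and the function name is hypothetical.

```python
def policy_inputs(plan: dict) -> list[dict]:
    """Flatten a Terraform JSON plan into one policy input per resource."""
    inputs = []
    for rc in plan.get("resource_changes", []):
        change = rc.get("change", {})
        if change.get("actions") == ["delete"]:
            continue  # resource is only being destroyed; nothing to evaluate
        after = change.get("after") or {}
        # Planned attributes first, so the two bookkeeping keys always win.
        inputs.append({**after, "address": rc["address"], "resource_type": rc["type"]})
    return inputs
```

Each returned dictionary can then be evaluated against the policy separately, so a denial message can be reported next to the resource address that triggered it.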

Resource Limits

# Per-team resource quotas (spend limits in USD, stored as numbers so they can be compared)
team: commerce
quotas:
  staging:
    max_instances: 10
    max_databases: 5
    max_monthly_spend_usd: 2000
  production:
    max_instances: 20
    max_databases: 5
    max_monthly_spend_usd: 10000
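Quotas like these only matter if something checks them at request time. A minimal sketch of that check follows, assuming the platform can report a team's current usage per environment. The Quota and Usage types are hypothetical, and the limits are copied from the commerce example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quota:
    max_instances: int
    max_databases: int
    max_monthly_spend: float  # USD


@dataclass(frozen=True)
class Usage:
    instances: int
    databases: int
    monthly_spend: float  # USD


# Limits copied from the commerce quotas above.
COMMERCE_QUOTAS = {
    "staging": Quota(10, 5, 2000),
    "production": Quota(20, 5, 10000),
}


def quota_violations(quota: Quota, current: Usage, requested: Usage) -> list[str]:
    """Return one message per limit the request would exceed (empty means allowed)."""
    problems = []
    if current.instances + requested.instances > quota.max_instances:
        problems.append(f"instances would exceed the limit of {quota.max_instances}")
    if current.databases + requested.databases > quota.max_databases:
        problems.append(f"databases would exceed the limit of {quota.max_databases}")
    if current.monthly_spend + requested.monthly_spend > quota.max_monthly_spend:
        problems.append(f"monthly spend would exceed ${quota.max_monthly_spend:,.0f}")
    return problems
```

Returning every violated limit at once, instead of stopping at the first, lets the developer fix the whole request in one pass.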

Environment Lifecycle

Self-service environments need automated lifecycle management:

Create:     Developer creates environment via portal
Active:     Environment running, costs accumulating
Warning:    14 days inactive → email owner
Hibernate:  21 days inactive → stop instances, keep data
Terminate:  30 days inactive → destroy everything

This prevents the “600 forgotten dev environments” problem.
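The schedule above reduces to a function from days of inactivity to a lifecycle stage. The sketch below uses the 14, 21, and 30 day thresholds from the list. How "last activity" is measured (deploys, traffic, logins) is left open, and the names are illustrative.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum


class Stage(Enum):
    ACTIVE = "active"
    WARNING = "warning"      # email the owner
    HIBERNATE = "hibernate"  # stop instances, keep data
    TERMINATE = "terminate"  # destroy everything


# Days of inactivity at which each stage begins, most severe first.
THRESHOLDS = [
    (30, Stage.TERMINATE),
    (21, Stage.HIBERNATE),
    (14, Stage.WARNING),
]


def lifecycle_stage(last_activity: datetime, now: datetime) -> Stage:
    """Return the stage an environment should be in, given its last activity."""
    inactive_days = (now - last_activity).days
    for days, stage in THRESHOLDS:
        if inactive_days >= days:
            return stage
    return Stage.ACTIVE
```

A scheduled job could call this once a day for every environment and act only when the returned stage differs from the stored one.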


Measuring Success

Metric                         | Target             | Why
Time to first deployment       | < 30 minutes       | From deciding to build a service to having it running
Platform team ticket volume    | Decreasing monthly | Self-service should reduce, not increase, requests
Developer satisfaction score   | > 4.0/5.0          | Survey developers quarterly
Cost per developer environment | < $X/month         | Self-service should not mean unlimited spending
Security policy violations     | 0                  | Guardrails should block violations before provisioning instead of leaving them for post-hoc audits

Anti-Patterns

Anti-Pattern                    | Consequence                       | Fix
Self-service without guardrails | Cost overruns, security gaps      | Policy-as-code enforcement
Too many options                | Decision paralysis, inconsistency | Opinionated defaults, limited choices
No cleanup automation           | Zombie environments accumulate    | Automatic lifecycle policies
Platform team still required    | Self-service in name only         | Automate the provisioning itself, not just the ticket queue
Ignoring developer feedback     | Low adoption, shadow IT           | Quarterly developer surveys, usage analytics

Self-service infrastructure is not about giving developers root access to AWS. It is about encoding your organization’s best practices — security, cost, reliability — into reusable, discoverable, guardrailed components that developers can use without waiting for anyone.

Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
