
Kubernetes Cost Optimization

Reduce Kubernetes costs without sacrificing reliability. Covers resource right-sizing, spot instances, cluster autoscaling, namespace quotas, cost allocation, and workload scheduling.

Kubernetes makes it easy to over-provision. Teams routinely request 4 CPUs and 8 GB of RAM per service “just in case,” while actual usage averages closer to 0.3 CPUs and 500 MB. Across 200 services, that gap can add up to hundreds of thousands of dollars per year in idle capacity. Kubernetes cost optimization is about matching requested resources to actual usage without causing outages.
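To make the scale of that gap concrete, here is a rough worked estimate. It assumes one replica per service running all year, and it uses illustrative unit prices of $0.03 per vCPU-hour and $0.004 per GB-hour; both prices are assumptions for the sake of the arithmetic, not quotes from any provider.

```latex
\begin{align*}
\text{idle CPU}    &= 200 \times (4 - 0.3) = 740 \text{ vCPU} \\
\text{idle memory} &= 200 \times (8 - 0.5) = 1500 \text{ GB} \\
\text{annual cost} &\approx (740 \times 0.03 + 1500 \times 0.004) \times 8760 \text{ h} \\
                   &= (22.2 + 6.0) \times 8760 \approx \$247{,}000
\end{align*}
```

Plugging in your own provider's prices and replica counts will move the total, but the structure of the calculation stays the same: requested minus used, times unit price, times hours.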


Where Kubernetes Money Goes

| Cost Component | Typical % | Optimization Lever |
| --- | --- | --- |
| Compute (nodes) | 60-70% | Right-size pods, use spot/preemptible, autoscale |
| Storage (PVCs) | 10-15% | Right-size volumes, lifecycle policies |
| Networking (LB, NAT) | 10-15% | Reduce cross-AZ traffic, internal LBs |
| Control plane | 5-10% | Managed K8s pricing varies by provider |

Resource Right-Sizing

# BEFORE: Over-provisioned (common default)
resources:
  requests:
    cpu: "2000m"     # Requesting 2 full CPUs
    memory: "4Gi"    # Requesting 4 GB
  limits:
    cpu: "4000m"
    memory: "8Gi"

# AFTER: Right-sized based on actual usage data
resources:
  requests:
    cpu: "250m"      # Actual p95 usage: 200m
    memory: "512Mi"  # Actual p95 usage: 400Mi
  limits:
    cpu: "500m"      # 2x headroom for bursts
    memory: "1Gi"    # 2x headroom for spikes

Finding Right Sizes

# Check actual CPU/memory usage vs requests
# (requires metrics-server; shows a point-in-time snapshot, not a p95)
kubectl top pods -n production --sort-by=cpu

# Use VPA recommendations
kubectl get vpa -n production -o yaml
# Look for: status.recommendation.containerRecommendations
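
The `kubectl get vpa` command above only returns data if a VerticalPodAutoscaler object exists for the workload and the VPA components are installed in the cluster. A minimal sketch of a recommendation-only VPA follows; the Deployment name `checkout-api` is a hypothetical placeholder, and `updateMode: "Off"` means the VPA computes recommendations without ever restarting or resizing pods.

```yaml
# Recommendation-only VPA (hypothetical target: Deployment "checkout-api")
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: checkout-api-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout-api
  updatePolicy:
    updateMode: "Off"   # Only publish recommendations; never evict or resize pods
```

Once the VPA has observed the workload for a while, its recommendations appear under `status.recommendation.containerRecommendations` and can be copied into the Deployment's `resources` block by hand.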

Spot/Preemptible Instances

| Workload Type | Spot Safe? | Strategy |
| --- | --- | --- |
| Stateless web services | ✅ Yes | Multiple replicas, PodDisruptionBudget |
| Batch/ML training | ✅ Yes | Checkpointing, retry on preemption |
| Stateful databases | ❌ No | On-demand instances only |
| CI/CD runners | ✅ Yes | Re-run jobs on preemption |
| Cron jobs | ✅ Yes | Retry policy |

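# Node pool with spot instances (GKE)
# GKE has no built-in NodePool Kubernetes resource; node pools are created with
# gcloud, Terraform, or Config Connector. Equivalent gcloud command
# (POOL_NAME and CLUSTER_NAME are placeholders; add --region or --zone
# if no default location is configured):
gcloud container node-pools create POOL_NAME \
  --cluster=CLUSTER_NAME \
  --spot \
  --machine-type=e2-standard-4 \
  --enable-autoscaling \
  --min-nodes=0 \
  --max-nodes=20 \
  --enable-autorepair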
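
Creating a spot node pool does not by itself move workloads onto it or protect them during disruptions. The sketch below shows one way a stateless service might opt in: a `nodeSelector` on the `cloud.google.com/gke-spot` label that GKE applies to spot nodes, plus a PodDisruptionBudget. The workload name, image, and replica counts are hypothetical. Note that a PodDisruptionBudget limits voluntary evictions such as node drains and autoscaler scale-down; it cannot stop the provider from reclaiming a spot VM, which is why multiple replicas still matter.

```yaml
# Stateless service pinned to GKE spot nodes (names and image are hypothetical)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-frontend
  namespace: production
spec:
  replicas: 3                      # Several replicas so one preemption is survivable
  selector:
    matchLabels:
      app: web-frontend
  template:
    metadata:
      labels:
        app: web-frontend
    spec:
      nodeSelector:
        cloud.google.com/gke-spot: "true"   # Label GKE sets on spot nodes
      containers:
        - name: web
          image: registry.example.com/web-frontend:1.0.0
          resources:
            requests:
              cpu: "250m"
              memory: "512Mi"
            limits:
              cpu: "500m"
              memory: "1Gi"
---
# Keep at least 2 replicas running during voluntary disruptions
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-frontend-pdb
  namespace: production
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web-frontend
```

A hard `nodeSelector` keeps the pods Pending if no spot capacity is available; a preferred node affinity is an alternative if falling back to on-demand nodes is acceptable.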

Cluster Autoscaler Settings

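# Aggressive scale-down for cost savings
# These keys are cluster-autoscaler command-line flags (passed as --flag=value)
cluster-autoscaler:
  scale-down-enabled: true
  scale-down-delay-after-add: 5m        # Wait 5 min after a scale-up before evaluating scale-down (default 10m)
  scale-down-unneeded-time: 5m          # Node unneeded for 5 min → remove (default 10m)
  scale-down-utilization-threshold: 0.5 # Requested/allocatable < 50% → candidate

  # Provisioning timeout and eviction safety
  max-node-provision-time: 15m          # Give up on a node that has not joined after 15 min
  skip-nodes-with-local-storage: false  # Allow removing nodes whose pods use emptyDir/hostPath
  skip-nodes-with-system-pods: true     # Keep nodes running non-DaemonSet kube-system pods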
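
Aggressive scale-down means pods get evicted and rescheduled more often. For the occasional workload that should not be moved mid-run, the cluster autoscaler honors a `safe-to-evict` pod annotation. The sketch below assumes a hypothetical single-replica `report-builder` Deployment; the annotation key is the autoscaler's own, everything else is illustrative.

```yaml
# Opt a pod out of autoscaler-driven eviction (workload name and image are hypothetical)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: report-builder
  namespace: production
spec:
  replicas: 1
  selector:
    matchLabels:
      app: report-builder
  template:
    metadata:
      labels:
        app: report-builder
      annotations:
        # Cluster autoscaler will not remove a node while this pod runs on it
        cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
    spec:
      containers:
        - name: worker
          image: registry.example.com/report-builder:1.0.0
          resources:
            requests:
              cpu: "250m"
              memory: "512Mi"
```

Use this sparingly: every pod marked this way pins its node and works against the cost savings the scale-down settings are meant to deliver.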

Anti-Patterns

| Anti-Pattern | Problem | Fix |
| --- | --- | --- |
| No resource requests | Scheduler can’t bin-pack, nodes underutilized | Set requests on every container |
| Requests = limits (always) | No burst capacity, need more nodes | Requests at p95 usage, limits at 2x |
| All on-demand nodes | Paying full price for interruptible workloads | Spot nodes for stateless workloads |
| No namespace quotas | One team consumes all resources | ResourceQuotas per namespace |
| No cost visibility | Nobody knows which team spends what | Cost allocation labels on all resources |
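
Two of the fixes in the table, per-namespace quotas and requests on every container, can be enforced with built-in objects. The following is a minimal sketch, assuming a hypothetical `team-payments` namespace and illustrative numbers: the ResourceQuota caps the namespace's total requests and limits, and the LimitRange supplies defaults for containers that omit them.

```yaml
# Cap total resources for one team's namespace (namespace and numbers are illustrative)
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-payments-quota
  namespace: team-payments
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    pods: "100"
---
# Default requests/limits for containers that do not declare their own
apiVersion: v1
kind: LimitRange
metadata:
  name: container-defaults
  namespace: team-payments
spec:
  limits:
    - type: Container
      defaultRequest:      # Applied when a container omits requests
        cpu: 250m
        memory: 512Mi
      default:             # Applied when a container omits limits
        cpu: 500m
        memory: 1Gi
```

The two objects work together: once a quota covers `requests.*` or `limits.*`, pods that leave those fields unset are rejected at admission unless a LimitRange fills them in.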

Checklist

  • Resource requests set on every pod (based on actual usage)
  • Limits set at 2x requests for burst headroom
  • VPA deployed for right-sizing recommendations
  • Spot/preemptible nodes for stateless workloads (40-60% savings)
  • Cluster autoscaler configured with aggressive scale-down
  • Namespace quotas for resource governance
  • Cost allocation labels: team, service, environment
  • Cost monitoring dashboard (Kubecost, OpenCost)
  • Monthly cost review with team owners
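
As an illustration of the cost allocation labels item, the sketch below applies `team`, `service`, and `environment` labels to a hypothetical `checkout-api` Deployment and, importantly, to its pod template, since cost tools generally attribute spend by the labels on running pods. The label values and image are assumptions.

```yaml
# Cost allocation labels on a workload (names, values, and image are hypothetical)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-api
  namespace: production
  labels:
    team: payments
    service: checkout-api
    environment: production
spec:
  replicas: 2
  selector:
    matchLabels:
      app: checkout-api
  template:
    metadata:
      labels:
        app: checkout-api
        team: payments              # Repeat on the pod template so pods carry them
        service: checkout-api
        environment: production
    spec:
      containers:
        - name: api
          image: registry.example.com/checkout-api:1.0.0
          resources:
            requests:
              cpu: "250m"
              memory: "512Mi"
```

Keeping the label keys identical across every team is what makes the monthly cost review workable; a dashboard cannot group by a label that half the workloads spell differently.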

:::note[Source] This guide is derived from operational intelligence at Garnet Grid Consulting. For Kubernetes cost optimization, visit garnetgrid.com. :::

Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
