FinOps for Kubernetes
Optimize Kubernetes costs with resource right-sizing, cluster autoscaling, workload placement, and cost visibility. Covers resource requests vs limits, spot instances for K8s, namespace cost allocation, and the patterns that prevent Kubernetes from becoming a money pit.
Kubernetes is a cost amplifier. If your workloads are over-provisioned on VMs, they are over-provisioned inside containers too — but now you also pay for the Kubernetes control plane, load balancers, and persistent volumes. Without FinOps practices, Kubernetes costs spiral silently because nobody owns the bill.
Where Kubernetes Money Goes
Typical Kubernetes Cost Breakdown:
Compute (nodes): 65-75%
Storage (PV/EBS): 10-15%
Networking (LB, NAT): 5-10%
Control plane: 2-5%
Other (monitoring, logs): 5-10%
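As a worked example of that breakdown, here is a rough split of a monthly bill using illustrative values picked from within each range above (the exact percentages are assumptions, chosen to total 100%):

```python
# Illustrative split of a cluster bill using values within each
# typical range above (percentages are assumptions, not measurements).
BREAKDOWN = {
    "compute_nodes": 0.70,   # within 65-75%
    "storage": 0.125,        # within 10-15%
    "networking": 0.075,     # within 5-10%
    "control_plane": 0.03,   # within 2-5%
    "other": 0.07,           # within 5-10%
}

def split_bill(total_monthly: float) -> dict:
    """Allocate a total monthly bill across cost categories."""
    return {k: round(total_monthly * pct, 2) for k, pct in BREAKDOWN.items()}

print(split_bill(10_000))
# compute dominates: $7,000 of a $10,000 bill goes to nodes
```

The takeaway from running this on any realistic total is the same: node compute dwarfs everything else, so right-sizing and autoscaling are where optimization effort pays off.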
Hidden costs:
☐ Over-provisioned resource requests (CPU/memory reserved but unused)
☐ Idle nodes (cluster autoscaler min > actual need)
☐ Orphaned PersistentVolumes (pod deleted, volume remains)
☐ Cross-AZ network traffic (nodes in different AZs)
☐ Unused LoadBalancers (roughly $16-18/month each on AWS, before traffic charges)
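One of these audits can be sketched in a few lines: finding orphaned PersistentVolumes. A PV whose `status.phase` is `Released` had its claim deleted but still bills for the underlying disk. This sketch assumes input in the shape produced by `kubectl get pv -o json`; only standard Kubernetes API fields are used.

```python
# Sketch: flag orphaned PersistentVolumes from `kubectl get pv -o json`.
# "Released" means the PersistentVolumeClaim was deleted but the volume
# (and its cloud disk) still exists and still costs money.

def find_orphaned_pvs(pv_list: dict) -> list[dict]:
    """Return name/size/storageClass for every Released PV."""
    orphans = []
    for pv in pv_list.get("items", []):
        if pv.get("status", {}).get("phase") == "Released":
            orphans.append({
                "name": pv["metadata"]["name"],
                "size": pv["spec"]["capacity"]["storage"],
                "storage_class": pv["spec"].get("storageClassName", ""),
            })
    return orphans

# Minimal example input in the kubectl JSON shape:
sample = {"items": [
    {"metadata": {"name": "pv-old"}, "status": {"phase": "Released"},
     "spec": {"capacity": {"storage": "100Gi"}, "storageClassName": "gp3"}},
    {"metadata": {"name": "pv-live"}, "status": {"phase": "Bound"},
     "spec": {"capacity": {"storage": "20Gi"}, "storageClassName": "gp3"}},
]}
print(find_orphaned_pvs(sample))  # only pv-old is flagged
```

Running this monthly (feeding it live `kubectl` output) catches volumes before they accumulate.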
Resource Right-Sizing
```yaml
# BEFORE: Over-provisioned (common default)
resources:
  requests:
    cpu: "1000m"    # Requesting 1 full CPU
    memory: "2Gi"   # Requesting 2 GB RAM
  limits:
    cpu: "2000m"
    memory: "4Gi"
# Actual usage: 50m CPU, 256Mi memory
# Waste: 950m CPU (95%), 1.75Gi memory (87.5%)

# AFTER: Right-sized based on actual usage
resources:
  requests:
    cpu: "100m"     # 2x actual peak (headroom)
    memory: "512Mi" # 2x actual peak
  limits:
    cpu: "500m"     # Burst capacity
    memory: "1Gi"   # OOM protection

# Tools for right-sizing:
# - VPA (Vertical Pod Autoscaler): Recommends resource values
# - Kubecost: Shows actual vs requested usage
# - Goldilocks: VPA recommendations per deployment
```
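The "2x actual peak" rule above can be expressed as a tiny recommender. This is a sketch: it assumes you already have CPU (millicores) and memory (MiB) usage samples, e.g. from Prometheus, and the function name and headroom factor are illustrative.

```python
def recommend_requests(cpu_samples_m, mem_samples_mi, headroom=2.0):
    """Recommend resource requests as headroom x observed peak usage.

    cpu_samples_m:  CPU usage samples in millicores
    mem_samples_mi: memory usage samples in MiB
    """
    return {
        "cpu": f"{int(max(cpu_samples_m) * headroom)}m",
        "memory": f"{int(max(mem_samples_mi) * headroom)}Mi",
    }

# The pod from the example above: peak 50m CPU, 256Mi memory
print(recommend_requests([22, 35, 50, 41], [180, 256, 240]))
# -> {'cpu': '100m', 'memory': '512Mi'}
```

In practice you would feed this a week or more of samples and use a high percentile rather than the raw max, but the arithmetic is the same.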
Cluster Autoscaler Optimization
```yaml
# Cluster Autoscaler: scale nodes based on pending pods.
# These settings are command-line flags on the cluster-autoscaler
# deployment (it is not configured through a ConfigMap):
containers:
  - name: cluster-autoscaler
    command:
      - ./cluster-autoscaler
      # Scale down faster (default 10 min is too slow)
      - --scale-down-delay-after-add=5m
      - --scale-down-unneeded-time=5m
      # Don't scale down nodes with utilization above 50%
      - --scale-down-utilization-threshold=0.5
      # Pick the node group that wastes the least resources
      - --expander=least-waste
```
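The `least-waste` expander picks the node type whose capacity exceeds the pending pods' requests by the smallest margin. A rough sketch of that scoring logic (the instance specs and function are illustrative, not the autoscaler's actual implementation):

```python
def least_waste(pending_cpu_m, pending_mem_mi, node_types):
    """Pick the node type that leaves the fewest unused resources
    after fitting the pending pods (fractional waste, CPU + memory)."""
    def waste(nt):
        cpu_waste = (nt["cpu_m"] - pending_cpu_m) / nt["cpu_m"]
        mem_waste = (nt["mem_mi"] - pending_mem_mi) / nt["mem_mi"]
        return cpu_waste + mem_waste

    fitting = [nt for nt in node_types
               if nt["cpu_m"] >= pending_cpu_m and nt["mem_mi"] >= pending_mem_mi]
    return min(fitting, key=waste)["name"]

nodes = [
    {"name": "m5.xlarge", "cpu_m": 4000, "mem_mi": 16384},  # general purpose
    {"name": "c5.xlarge", "cpu_m": 4000, "mem_mi": 8192},   # compute optimized
]
# CPU-heavy pending pods waste less memory on the c5:
print(least_waste(3500, 6000, nodes))  # -> c5.xlarge
```

The point of the sketch: with mixed node groups, the expander naturally routes CPU-heavy load to compute-optimized types and memory-heavy load to general-purpose ones, which is why mixing instance types saves money.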
```yaml
# Node groups with mixed instance types
node_groups:
  general:
    instance_types:
      - m5.xlarge   # On-demand: $0.192/hr
      - m5a.xlarge  # AMD variant: $0.172/hr (10% cheaper)
      - m6i.xlarge  # Newer gen: $0.192/hr (better perf/$)
    min_size: 2
    max_size: 20
  spot:
    instance_types:
      - m5.xlarge
      - m5.2xlarge
      - m5a.xlarge
      - c5.xlarge   # Mix families for spot availability
    capacity_type: SPOT  # 60-90% discount
    min_size: 0
    max_size: 30
```
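Back-of-the-envelope for a split like this: blended hourly cost once part of the fleet runs on spot. The 70% discount used here is an assumption within the 60-90% range quoted above; actual spot prices vary by instance type and AZ.

```python
def blended_node_cost(on_demand_rate, n_ondemand, n_spot, spot_discount=0.70):
    """Blended hourly fleet cost when part of it runs on spot capacity.
    spot_discount=0.70 (an assumed value) means spot costs 30% of on-demand."""
    spot_rate = on_demand_rate * (1 - spot_discount)
    return n_ondemand * on_demand_rate + n_spot * spot_rate

# 10 x m5.xlarge at $0.192/hr on-demand, vs 2 on-demand + 8 spot:
all_od = blended_node_cost(0.192, 10, 0)  # $1.92/hr
mixed = blended_node_cost(0.192, 2, 8)    # ~$0.84/hr, roughly 56% cheaper
print(round(all_od, 2), round(mixed, 4))
```

Keeping a small on-demand baseline (min_size: 2 in the general group) protects stateful or critical pods while the stateless bulk rides the discount.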
Cost Allocation by Namespace
```python
# Kubecost-style namespace cost attribution
class K8sCostAllocator:
    def allocate_costs(self, period="month"):
        """Calculate cost per namespace based on actual resource usage."""
        namespaces = self.get_all_namespaces()
        node_costs = self.get_node_costs(period)
        allocation = {}
        for ns in namespaces:
            pods = self.get_pods(namespace=ns)
            cpu_hours = sum(p.cpu_usage_hours for p in pods)
            memory_gb_hours = sum(p.memory_usage_gb_hours for p in pods)
            storage_gb = sum(p.pv_size_gb for p in pods)
            allocation[ns] = {
                "compute": (cpu_hours * node_costs.cost_per_cpu_hour +
                            memory_gb_hours * node_costs.cost_per_gb_hour),
                "storage": storage_gb * node_costs.cost_per_gb_month,
                "network": self.get_namespace_egress_cost(ns),
                "total": None,  # Calculated below
                "efficiency": cpu_hours / sum(p.cpu_requested_hours for p in pods),
            }
            allocation[ns]["total"] = sum(v for k, v in allocation[ns].items()
                                          if k not in ("total", "efficiency"))
        return allocation
```
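The class above leans on cluster lookups (`get_pods`, `get_node_costs`) that aren't shown. Here is a self-contained sketch of the same attribution math for a single namespace, with the rates and pod usage passed in directly; all numbers and names are illustrative:

```python
def namespace_compute_cost(pods, cost_per_cpu_hour, cost_per_gb_hour):
    """Attribute compute cost to one namespace from pod usage records,
    and report efficiency = used CPU-hours / requested CPU-hours."""
    cpu_used = sum(p["cpu_usage_hours"] for p in pods)
    cpu_requested = sum(p["cpu_requested_hours"] for p in pods)
    mem_used = sum(p["memory_usage_gb_hours"] for p in pods)
    cost = cpu_used * cost_per_cpu_hour + mem_used * cost_per_gb_hour
    efficiency = cpu_used / cpu_requested if cpu_requested else 0.0
    return {"compute": round(cost, 2), "efficiency": round(efficiency, 2)}

# Two pods over a month (illustrative usage numbers and rates):
pods = [
    {"cpu_usage_hours": 36, "cpu_requested_hours": 720, "memory_usage_gb_hours": 180},
    {"cpu_usage_hours": 90, "cpu_requested_hours": 360, "memory_usage_gb_hours": 400},
]
print(namespace_compute_cost(pods, cost_per_cpu_hour=0.04, cost_per_gb_hour=0.005))
# -> {'compute': 7.94, 'efficiency': 0.12}
```

An efficiency of 0.12 means the namespace used 12% of the CPU it reserved, which is exactly the kind of number that makes a team revisit its requests.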
Anti-Patterns
| Anti-Pattern | Consequence | Fix |
|---|---|---|
| Default resource requests | 10x over-provisioning | VPA recommendations, monthly right-sizing |
| No HPA (Horizontal Pod Autoscaler) | Fixed pod count wastes resources | HPA on CPU/custom metrics |
| On-demand only | Pay full price for fault-tolerant workloads | Spot/preemptible for stateless workloads |
| No namespace cost visibility | Teams cannot see their spend | Kubecost or OpenCost per namespace |
| Orphaned resources | PVs, LBs, and nodes that serve nothing | Monthly resource audit, automated cleanup |
Kubernetes does not have a cost problem — it has a visibility problem. When teams can see what their workloads cost, they optimize naturally. Give them the data.