FinOps for Kubernetes
Optimize Kubernetes cluster costs without sacrificing reliability. Covers right-sizing pods, cluster autoscaling, spot instances, namespace cost allocation, and the patterns that prevent Kubernetes from becoming the most expensive way to waste cloud resources.
Kubernetes makes it trivially easy to consume cloud resources — and trivially easy to waste them. Without FinOps practices, teams request 4 CPUs and 8 GB RAM “just in case,” clusters auto-scale to handle a load spike and never scale back down, and nobody knows which team’s pods are costing the most. Kubernetes FinOps turns cluster costs from a black box into an optimized, accountable system.
Right-Sizing Pods
# Before right-sizing (common waste pattern):
resources:
  requests:
    cpu: "2000m"      # Requested 2 CPU
    memory: "4Gi"     # Requested 4 GiB
  limits:
    cpu: "4000m"      # Limit 4 CPU
    memory: "8Gi"     # Limit 8 GiB

# Actual usage (discovered via monitoring):
#   CPU:    avg 200m, p99 800m
#   Memory: avg 512Mi, p99 1.2Gi

# After right-sizing:
resources:
  requests:
    cpu: "400m"       # 2x average usage; bursts up to p99 are covered by the limit
    memory: "1500Mi"  # ~1.2x p99 usage
  limits:
    cpu: "1000m"      # 1.25x p99 (burst capacity)
    memory: "2Gi"     # Hard cap; a container that exceeds it is OOM-killed

# Savings on requests: 80% CPU reduction, ~63% memory reduction
# (limits drop by 75% for both CPU and memory)
# Per pod: $0.15/hour → $0.04/hour
# 50 pods: $7.50/hour → $2.00/hour
# Annual (8,760 hours): $65,700 → $17,520 = $48,180 saved
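The arithmetic behind the figures above can be sketched in a few lines of Python. This is a minimal illustration, not a sizing tool. The function names are hypothetical, and the multipliers (2x average usage for the CPU request, 1.25x p99 for the CPU limit) are assumptions read off the numbers in the example rather than a universal rule.

```python
from __future__ import annotations

HOURS_PER_YEAR = 8760


def recommend_cpu(avg_m: float, p99_m: float,
                  request_factor: float = 2.0,
                  limit_factor: float = 1.25) -> tuple[int, int]:
    """Return (request, limit) in millicores.

    Assumption: request = request_factor * average usage,
    limit = limit_factor * p99 usage.
    """
    return round(avg_m * request_factor), round(p99_m * limit_factor)


def reduction_pct(before: float, after: float) -> float:
    """Percentage saved when going from `before` to `after`."""
    return (before - after) * 100 / before


def annual_cost(hourly_per_pod: float, pods: int) -> float:
    """Yearly cost for `pods` replicas running around the clock."""
    return hourly_per_pod * pods * HOURS_PER_YEAR


if __name__ == "__main__":
    request, limit = recommend_cpu(avg_m=200, p99_m=800)
    print(f"CPU request {request}m, limit {limit}m")
    print(f"CPU request reduction: {reduction_pct(2000, request):.0f}%")
    # Memory requests compared in Mi: 4Gi = 4096Mi versus 1500Mi
    print(f"Memory request reduction: {reduction_pct(4096, 1500):.0f}%")
    saved = annual_cost(0.15, 50) - annual_cost(0.04, 50)
    print(f"Annual savings: ${saved:,.0f}")
```

In practice the usage percentiles would come from a monitoring system over a window long enough to include peak traffic. The factors are worth tuning per workload instead of applying one pair everywhere.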
Cost Allocation
class KubernetesCostAllocator:
    """Allocate cluster costs to namespaces and teams."""

    def calculate_namespace_costs(self, cluster: str) -> dict:
        """Calculate cost per namespace based on requested resources.

        Assumes get_cluster_hourly_cost, get_cluster_capacity, get_namespaces,
        and get_namespace_usage are implemented elsewhere on this class.
        """
        cluster_cost = self.get_cluster_hourly_cost(cluster)
        total_capacity = self.get_cluster_capacity(cluster)
        namespace_costs = {}
        for ns in self.get_namespaces(cluster):
            usage = self.get_namespace_usage(ns)
            # Share of cluster capacity reserved by this namespace's requests
            cpu_fraction = usage.cpu / total_capacity.cpu
            mem_fraction = usage.memory / total_capacity.memory
            # Weighted average (CPU typically more expensive)
            cost_fraction = (cpu_fraction * 0.6) + (mem_fraction * 0.4)
            hourly_cost = cluster_cost * cost_fraction
            namespace_costs[ns.name] = {
                "hourly_cost": hourly_cost,
                "monthly_cost": hourly_cost * 730,  # ~730 hours per month
                "cpu_requested": usage.cpu,
                "cpu_used": usage.cpu_actual,
                # Efficiency = used / requested, in percent (0 if nothing is requested)
                "cpu_efficiency": usage.cpu_actual / usage.cpu * 100 if usage.cpu else 0.0,
                "memory_requested": usage.memory,
                "memory_used": usage.memory_actual,
                "memory_efficiency": usage.memory_actual / usage.memory * 100 if usage.memory else 0.0,
                "owner": ns.labels.get("team", "unknown"),
            }
        return namespace_costs
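To make the allocation formula concrete, here is a self-contained sketch that runs on in-memory data instead of a live cluster. The `NamespaceUsage` dataclass, the `allocate` function, and the sample numbers are all hypothetical. The 60/40 CPU-to-memory weighting is taken from the class above. One consequence worth seeing: because each share is computed against total cluster capacity, the namespace costs do not add up to the cluster bill, and the remainder is idle capacity that nobody requested.

```python
from dataclasses import dataclass


@dataclass
class NamespaceUsage:
    name: str
    team: str
    cpu_requested: float  # cores
    cpu_used: float       # cores
    mem_requested: float  # GiB
    mem_used: float       # GiB


def allocate(cluster_hourly_cost, capacity_cpu, capacity_mem,
             namespaces, cpu_weight=0.6):
    """Split the hourly cluster cost across namespaces by requested resources."""
    mem_weight = 1.0 - cpu_weight
    report = {}
    for ns in namespaces:
        fraction = (cpu_weight * ns.cpu_requested / capacity_cpu
                    + mem_weight * ns.mem_requested / capacity_mem)
        report[ns.name] = {
            "owner": ns.team,
            "hourly_cost": cluster_hourly_cost * fraction,
            # Used / requested in percent; 0 when nothing was requested
            "cpu_efficiency": (ns.cpu_used / ns.cpu_requested * 100
                               if ns.cpu_requested else 0.0),
            "memory_efficiency": (ns.mem_used / ns.mem_requested * 100
                                  if ns.mem_requested else 0.0),
        }
    allocated = sum(entry["hourly_cost"] for entry in report.values())
    # Capacity that no namespace requested is still paid for
    idle_cost = cluster_hourly_cost - allocated
    return report, idle_cost


if __name__ == "__main__":
    sample = [
        NamespaceUsage("checkout", "payments", 40, 10, 100, 50),
        NamespaceUsage("search", "discovery", 20, 15, 80, 60),
    ]
    report, idle = allocate(10.0, capacity_cpu=100, capacity_mem=400,
                            namespaces=sample)
    for name, entry in report.items():
        print(name, entry["owner"], f"${entry['hourly_cost']:.2f}/hour",
              f"CPU efficiency {entry['cpu_efficiency']:.0f}%")
    print(f"Idle capacity: ${idle:.2f}/hour")
```

How the idle remainder is handled is a policy choice. It can be reported as a separate line item, as here, or redistributed proportionally across teams.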
Anti-Patterns
| Anti-Pattern | Consequence | Fix |
|---|---|---|
| No resource requests/limits | Noisy neighbors, unpredictable scheduling | Mandatory requests and limits via admission webhook |
| Over-provisioned requests | Wasted capacity, cluster scales unnecessarily | Right-size based on actual usage metrics |
| No Cluster Autoscaler | Fixed cluster size wastes money off-peak | Cluster Autoscaler or Karpenter |
| All on-demand nodes | Miss 60-90% savings on workloads that tolerate interruptions | Spot/preemptible nodes for stateless workloads |
| No cost visibility per namespace | Nobody owns the cluster cost | Cost allocation by namespace + team labels |
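
As an illustration of the first row, the check that an admission webhook might run can be sketched as a plain function over a parsed pod manifest. This is only the validation logic, not a webhook server. The function name `find_missing_resources` is hypothetical, and the sketch inspects only `spec.containers`, using the standard `resources.requests` and `resources.limits` fields of the pod spec.

```python
REQUIRED_SECTIONS = ("requests", "limits")
REQUIRED_RESOURCES = ("cpu", "memory")


def find_missing_resources(pod: dict) -> list:
    """Return one message per missing CPU/memory request or limit."""
    problems = []
    for container in pod.get("spec", {}).get("containers", []):
        name = container.get("name", "<unnamed>")
        resources = container.get("resources") or {}
        for section in REQUIRED_SECTIONS:
            values = resources.get(section) or {}
            for resource in REQUIRED_RESOURCES:
                if resource not in values:
                    problems.append(
                        f"{name}: missing resources.{section}.{resource}"
                    )
    return problems


if __name__ == "__main__":
    pod = {
        "spec": {
            "containers": [
                {
                    "name": "api",
                    "resources": {
                        "requests": {"cpu": "400m", "memory": "1500Mi"},
                        "limits": {"cpu": "1000m", "memory": "2Gi"},
                    },
                },
                # Requests only: both limits are reported as missing
                {
                    "name": "sidecar",
                    "resources": {"requests": {"cpu": "50m", "memory": "64Mi"}},
                },
            ]
        }
    }
    for problem in find_missing_resources(pod):
        print(problem)
```

A policy built on a check like this would reject the pod whenever the returned list is non-empty and include the messages in the rejection reason.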
Kubernetes FinOps is about matching resource allocation to actual usage. The gap between what teams request and what they use is pure waste — and in most organizations, that gap is 60-80%. Right-sizing, spot instances, and cost visibility close that gap.