FinOps for Kubernetes | The Garnet Wiki

Kubernetes makes it trivially easy to consume cloud resources — and trivially easy to waste them. Without FinOps practices, teams request 4 CPUs and 8 GB RAM “just in case,” clusters auto-scale to handle a load spike and never scale back down, and nobody knows which team’s pods are costing the most. Kubernetes FinOps turns cluster costs from a black box into an optimized, accountable system.

Right-Sizing Pods

# Before right-sizing (common waste pattern):
resources:
  requests:
    cpu: "2000m"     # Requested 2 CPU
    memory: "4Gi"    # Requested 4 GB
  limits:
    cpu: "4000m"     # Limit 4 CPU
    memory: "8Gi"    # Limit 8 GB
    
# Actual usage (discovered via monitoring):
#   CPU: avg 200m, p99 800m
#   Memory: avg 512Mi, p99 1.2Gi

# After right-sizing:
resources:
  requests:
    cpu: "400m"      # 2x average usage (headroom); bursts up to p99 rely on the limit
    memory: "1500Mi" # ~1.25x p99 usage
  limits:
    cpu: "1000m"     # 1.25x p99 (burst capacity)
    memory: "2Gi"    # Hard cap

# Savings: 80% CPU request reduction, ~63% memory request reduction (limits drop 75%)
# Per pod: $0.15/hour → $0.04/hour
# 50 pods: $7.50/hour → $2.00/hour
# Annual: $65,700 → $17,520 = $48,180 saved
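
The arithmetic in the example above can be reproduced with a short script. This is a minimal sketch, not a sizing policy: the `Usage` class, the function names, and the multipliers (2x average for the CPU request, 1.25x p99 for the CPU limit and the memory request) are assumptions chosen so the output matches the numbers shown, and the hourly prices are the illustrative ones from the example.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Observed usage for one container (CPU in millicores, memory in Mi)."""
    cpu_avg_m: float
    cpu_p99_m: float
    mem_p99_mi: float


def recommend(usage: Usage) -> dict:
    """Derive requests and limits from observed usage.

    Multipliers are illustrative assumptions, not a universal rule.
    """
    return {
        "cpu_request_m": round(usage.cpu_avg_m * 2),       # 2x average
        "cpu_limit_m": round(usage.cpu_p99_m * 1.25),      # 1.25x p99
        "mem_request_mi": round(usage.mem_p99_mi * 1.25),  # 1.25x p99
    }


def annual_savings(before_hourly: float, after_hourly: float, pods: int) -> float:
    """Savings for identical pods running all year (8,760 hours)."""
    return (before_hourly - after_hourly) * pods * 8760


# 1.2Gi is 1228.8Mi
rec = recommend(Usage(cpu_avg_m=200, cpu_p99_m=800, mem_p99_mi=1228.8))
saved = annual_savings(before_hourly=0.15, after_hourly=0.04, pods=50)
print(rec)
print(f"annual savings: ${saved:,.0f}")
```

Sizing the CPU request from the average rather than p99 keeps the request small, which means p99 bursts depend on spare node capacity up to the limit. A latency-sensitive workload may warrant a request closer to p99.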

Cost Allocation

class KubernetesCostAllocator:
    """Allocate cluster costs to namespaces and teams.

    The data-access helpers (get_cluster_hourly_cost, get_cluster_capacity,
    get_namespaces, get_namespace_usage) are not shown here.
    """

    # Weighted average (CPU typically more expensive); tune to your node pricing
    CPU_WEIGHT = 0.6
    MEMORY_WEIGHT = 0.4

    @staticmethod
    def _efficiency(used, requested):
        """Percentage of the requested amount actually used (0 if nothing requested)."""
        return used / requested * 100 if requested else 0.0

    def calculate_namespace_costs(self, cluster: str):
        """Calculate cost per namespace from its share of requested resources."""

        cluster_cost = self.get_cluster_hourly_cost(cluster)
        total_capacity = self.get_cluster_capacity(cluster)

        namespace_costs = {}
        for ns in self.get_namespaces(cluster):
            usage = self.get_namespace_usage(ns)

            # Share of cluster capacity this namespace has requested
            cpu_fraction = usage.cpu / total_capacity.cpu
            mem_fraction = usage.memory / total_capacity.memory

            cost_fraction = (cpu_fraction * self.CPU_WEIGHT) + (mem_fraction * self.MEMORY_WEIGHT)

            namespace_costs[ns.name] = {
                "hourly_cost": cluster_cost * cost_fraction,
                "monthly_cost": cluster_cost * cost_fraction * 730,  # ~730 hours per month
                "cpu_requested": usage.cpu,
                "cpu_used": usage.cpu_actual,
                "cpu_efficiency": self._efficiency(usage.cpu_actual, usage.cpu),
                "memory_requested": usage.memory,
                "memory_used": usage.memory_actual,
                "memory_efficiency": self._efficiency(usage.memory_actual, usage.memory),
                "owner": ns.labels.get("team", "unknown"),
            }

        # Capacity nobody requested is not charged to any namespace here
        return namespace_costs
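
To see what the weighted split produces, here is a self-contained sketch with made-up numbers. The `NamespaceRequests` class, the `allocate` function, and all figures are hypothetical. The sketch keeps the 60/40 CPU-to-memory weighting from the class above and also reports the cost of capacity that no namespace requested, since proportional allocation against total capacity leaves that part unassigned.

```python
from dataclasses import dataclass


@dataclass
class NamespaceRequests:
    name: str
    cpu_cores: float   # sum of CPU requests in the namespace
    memory_gib: float  # sum of memory requests in the namespace


def allocate(hourly_cost, capacity_cpu, capacity_mem, namespaces, cpu_weight=0.6):
    """Split an hourly cluster cost across namespaces by requested share.

    Returns (per_namespace_costs, unallocated_cost). How to charge back the
    unallocated part (idle capacity) is a policy decision left to the caller.
    """
    mem_weight = 1.0 - cpu_weight
    costs = {}
    for ns in namespaces:
        fraction = (cpu_weight * ns.cpu_cores / capacity_cpu
                    + mem_weight * ns.memory_gib / capacity_mem)
        costs[ns.name] = hourly_cost * fraction
    unallocated = hourly_cost - sum(costs.values())
    return costs, unallocated


costs, idle = allocate(
    hourly_cost=10.0,   # hypothetical cluster price per hour
    capacity_cpu=100,   # cores
    capacity_mem=400,   # GiB
    namespaces=[
        NamespaceRequests("checkout", cpu_cores=30, memory_gib=80),
        NamespaceRequests("search", cpu_cores=20, memory_gib=120),
    ],
)
for name, cost in costs.items():
    print(f"{name}: ${cost:.2f}/hour")
print(f"unallocated (idle): ${idle:.2f}/hour")
```

In this example half of the cluster cost is idle capacity. Reporting it separately, rather than spreading it silently across namespaces, keeps over-provisioning at the cluster level visible.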

Anti-Patterns

Anti-Pattern	Consequence	Fix
No resource requests/limits	Noisy neighbors, unpredictable scheduling	Mandatory requests and limits via admission webhook
Over-provisioned requests	Wasted capacity, cluster scales unnecessarily	Right-size based on actual usage metrics
No Cluster Autoscaler	Fixed cluster size wastes money off-peak	Cluster Autoscaler or Karpenter
All on-demand nodes	Misses 60-90% savings on workloads that tolerate interruption	Spot/preemptible nodes for stateless workloads
No cost visibility per namespace	Nobody owns the cluster cost	Cost allocation by namespace + team labels
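
The first fix in the table, mandatory requests and limits, comes down to a check on each pod spec. The sketch below shows only that check, run against a pod manifest loaded as a plain dictionary. It is not a complete admission webhook, and the function name and sample pod are hypothetical.

```python
REQUIRED = ("cpu", "memory")


def missing_resources(pod: dict) -> list[str]:
    """Return one message per missing CPU/memory request or limit."""
    problems = []
    for container in pod.get("spec", {}).get("containers", []):
        resources = container.get("resources") or {}
        name = container.get("name", "?")
        for section in ("requests", "limits"):
            values = resources.get(section) or {}
            for key in REQUIRED:
                if key not in values:
                    problems.append(f"{name}: missing {section}.{key}")
    return problems


# Hypothetical pod: "api" lacks a CPU limit, "sidecar" declares nothing.
pod = {
    "metadata": {"name": "example"},
    "spec": {
        "containers": [
            {
                "name": "api",
                "resources": {
                    "requests": {"cpu": "400m", "memory": "1500Mi"},
                    "limits": {"memory": "2Gi"},
                },
            },
            {"name": "sidecar"},
        ]
    },
}

problems = missing_resources(pod)
for line in problems:
    print(line)
```

A webhook or CI step built on a check like this would reject the pod when the returned list is non-empty.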

Kubernetes FinOps is about matching resource allocation to actual usage. The gap between what teams request and what they use is pure waste, and in many organizations that gap runs 60-80%. Right-sizing and cost visibility close that gap; spot instances lower the price of the capacity that remains.
