Cost Optimization and Overprovisioning Analysis
Cost Optimization (info)
Identify wasted spend from overprovisioned resource requests that leave nodes underutilized despite appearing full.
Prompt: “Our Kubernetes cloud bill keeps growing and I suspect we're overprovisioning — can you help me analyze our resource requests vs actual usage to find where we're wasting money on unused CPU and memory?”
Agent Playbook
When an agent encounters this scenario, Schema provides these diagnostic steps automatically.
When investigating Kubernetes overprovisioning and cost waste, start by measuring overall cluster efficiency to understand the scale of the problem, then identify abandoned resources for quick wins before diving into container-level request-to-usage analysis. Finally, check for resource reservation issues that prevent efficient bin-packing and drive unnecessary cluster scaling.
1. Measure overall cluster efficiency
Calculate cluster efficiency by comparing `kubernetes_cpu_usage` and `kubernetes_memory_usage` against `kubernetes_cpu_requested` and `kubernetes_memory_requested` across all pods. If the usage-to-request ratio is below 50% (as flagged by `kubecost-cluster-efficiency-below-threshold`), you have significant overprovisioning driving up costs. This gives you the baseline to quantify total waste and prioritize where to focus next.
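The ratio described above can be sketched as follows. This is a minimal illustration, not Kubecost's implementation: the per-pod dicts are hypothetical stand-ins for the `kubernetes_cpu_usage` / `kubernetes_cpu_requested` metric pairs, and the sample values are made up.

```python
# Illustrative sketch: cluster-wide usage-to-request ratios from per-pod samples.
# Field names mirror the metrics in the text; the pod data is hypothetical.

EFFICIENCY_THRESHOLD = 0.50  # below this, kubecost-cluster-efficiency-below-threshold fires

def cluster_efficiency(pods):
    """Return (cpu_ratio, memory_ratio) of total usage over total requests."""
    cpu_used = sum(p["cpu_usage"] for p in pods)
    cpu_req = sum(p["cpu_requested"] for p in pods)
    mem_used = sum(p["memory_usage"] for p in pods)
    mem_req = sum(p["memory_requested"] for p in pods)
    return cpu_used / cpu_req, mem_used / mem_req

pods = [
    {"cpu_usage": 0.2, "cpu_requested": 1.0, "memory_usage": 256, "memory_requested": 1024},
    {"cpu_usage": 0.5, "cpu_requested": 2.0, "memory_usage": 512, "memory_requested": 2048},
]

cpu_eff, mem_eff = cluster_efficiency(pods)
overprovisioned = cpu_eff < EFFICIENCY_THRESHOLD or mem_eff < EFFICIENCY_THRESHOLD
```

With these sample pods, both ratios land around 25%, well under the 50% threshold, so the cluster would be flagged as overprovisioned.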
2. Find abandoned and idle resources for quick wins
Check for deployments with zero replicas, unattached PVCs, and idle LoadBalancer services—these represent pure waste that can be eliminated immediately. As noted in `kubecost-idle-resource-cost-accumulation`, these typically account for 20-30% of cloud spend and require no complex analysis, just cleanup. This is the fastest way to reduce your bill while you investigate the deeper overprovisioning issues.
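The cleanup scan above can be sketched as a simple inventory filter. This is a hypothetical illustration: in practice the deployment, PVC, and service records would come from the Kubernetes API, and the field names and sample objects here are assumptions.

```python
# Illustrative sketch: flag the pure-waste patterns from
# kubecost-idle-resource-cost-accumulation. Inventory data is hypothetical.

def find_abandoned(deployments, pvcs, services):
    waste = []
    # Deployments scaled to zero still hold their manifests and any attached resources.
    waste += [("deployment", d["name"]) for d in deployments if d["replicas"] == 0]
    # PVCs with no bound pod keep billing for their underlying volumes.
    waste += [("pvc", p["name"]) for p in pvcs if p["bound_pod"] is None]
    # LoadBalancer services with no traffic still incur a per-hour cloud charge.
    waste += [("loadbalancer", s["name"]) for s in services
              if s["type"] == "LoadBalancer" and s["active_connections"] == 0]
    return waste

deployments = [{"name": "old-api", "replicas": 0}, {"name": "web", "replicas": 3}]
pvcs = [{"name": "orphan-data", "bound_pod": None}]
services = [{"name": "legacy-lb", "type": "LoadBalancer", "active_connections": 0}]

waste = find_abandoned(deployments, pvcs, services)
```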
3. Identify containers with the largest request-to-usage gaps
Compare `kubernetes_cpu_usage` to `kubernetes_cpu_requested` and `kubernetes_memory_usage` to `kubernetes_memory_requested` at the container level to find the worst offenders. Containers using less than 30% of their requested resources (highlighted by `kubecost-over-requested-container-waste`) are prime candidates for right-sizing. Focus on high-request containers first since reducing a 4-core request to 1 core saves more than optimizing a 0.5-core pod.
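The "high-request containers first" prioritization can be sketched like this. The container figures are hypothetical; only the 30% threshold comes from the insight named above.

```python
# Illustrative sketch: rank right-sizing candidates by absolute wasted cores,
# not by ratio alone. Container data is hypothetical.

WASTE_RATIO = 0.30  # threshold from kubecost-over-requested-container-waste

def rightsizing_candidates(containers):
    candidates = [c for c in containers
                  if c["cpu_usage"] / c["cpu_requested"] < WASTE_RATIO]
    # A 4-core request at 10% usage strands far more capacity than a
    # 0.5-core request at 10% usage, so sort by the absolute gap.
    return sorted(candidates,
                  key=lambda c: c["cpu_requested"] - c["cpu_usage"],
                  reverse=True)

containers = [
    {"name": "batch-worker", "cpu_requested": 4.0, "cpu_usage": 0.4},
    {"name": "sidecar", "cpu_requested": 0.5, "cpu_usage": 0.05},
    {"name": "api", "cpu_requested": 2.0, "cpu_usage": 1.5},
]

ranked = rightsizing_candidates(containers)
```

Here `batch-worker` and `sidecar` share the same 10% usage ratio, but `batch-worker` tops the list because it strands 3.6 cores versus 0.45.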
4. Check for sustained pod-level underutilization patterns
Monitor pod-level `kubernetes_cpu_usage` and `kubernetes_memory_usage` over time, not just point-in-time snapshots. If pods consistently use less than 30-40% of their requested resources for days or weeks, they're chronically overprovisioned (as described in `pod-cpu-and-memory-underutilization-driving-cost-waste`). Set requests to match the P75 of actual usage rather than the peak or a theoretical maximum, to avoid wasting node capacity.
5. Identify resource reservation blocking efficient scheduling
Look for nodes where `kubernetes_cpu_requested` and `kubernetes_memory_requested` exceed 80% of `kubernetes_cpu_capacity` and `kubernetes_memory_capacity`, but actual `kubernetes_cpu_usage` and `kubernetes_memory_usage` remain below 50%. This pattern (flagged by `resource-reservation-preventing-node-utilization`) causes new pods to fail scheduling despite plenty of actual capacity, forcing unnecessary cluster scaling and driving up costs. Right-sizing the over-requested pods will unlock this stranded capacity.
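The stranded-capacity pattern above reduces to two per-node ratios. A minimal sketch, with hypothetical node data standing in for the `kubernetes_*_requested`, `kubernetes_*_usage`, and `kubernetes_*_capacity` metrics:

```python
# Illustrative sketch: flag nodes matching resource-reservation-preventing-node-utilization,
# where requests consume allocatable capacity while real usage stays low.

REQUEST_PRESSURE = 0.80  # requests above 80% of capacity block new pods
USAGE_CEILING = 0.50     # while actual usage stays under 50%

def stranded_capacity_nodes(nodes):
    stranded = []
    for n in nodes:
        reserved = n["cpu_requested"] / n["cpu_capacity"]
        used = n["cpu_usage"] / n["cpu_capacity"]
        if reserved > REQUEST_PRESSURE and used < USAGE_CEILING:
            stranded.append(n["name"])
    return stranded

nodes = [
    {"name": "node-a", "cpu_capacity": 8.0, "cpu_requested": 7.2, "cpu_usage": 2.0},
    {"name": "node-b", "cpu_capacity": 8.0, "cpu_requested": 4.0, "cpu_usage": 3.0},
]

stranded = stranded_capacity_nodes(nodes)
```

Here `node-a` is 90% reserved but only 25% used: the scheduler sees it as full, so new pods trigger scale-up even though six cores sit idle.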
Related Insights
Kubecost Cluster Efficiency Below Threshold (warning)
When the cluster efficiency ratio drops below 50%, significant capacity is wasted due to overprovisioning, poor bin-packing, or resource fragmentation, driving up infrastructure costs.
Kubecost Idle Resource Cost Accumulation (warning)
Unused Kubernetes clusters, abandoned workloads with zero replicas, unattached volumes, and idle load balancers continue to accumulate costs even when not serving traffic, typically representing 20-30% waste.
Kubecost Over-Requested Container Waste (warning)
Containers requesting significantly more CPU or memory than they actually use lead to node overprovisioning and wasted cloud spend. Kubecost identifies these inefficiencies through usage-vs-request analysis.
Pod CPU and Memory Underutilization Driving Cost Waste (info)
Consistently low CPU and memory usage in pods indicates over-provisioned resource requests, leading to wasted node capacity and unnecessary infrastructure costs that can be recovered through right-sizing.
Resource Reservation Preventing Node Utilization (warning)
Kubernetes nodes show low actual resource usage but cannot schedule new pods because resource requests have consumed most allocatable capacity. This triggers cluster scaling even while actual utilization is low, leading to stranded capacity and unnecessary costs.
Inefficient CPU-to-memory Ratio Degrades Vector Workload Performance (warning)