Cost Optimization and Overprovisioning Analysis
Cost Optimization (info)
Identify wasted spend from overprovisioned resource requests that leave nodes underutilized despite appearing full.
Prompt: “Our Kubernetes cloud bill keeps growing and I suspect we're overprovisioning — can you help me analyze our resource requests vs actual usage to find where we're wasting money on unused CPU and memory?”
Agent Playbook
When an agent encounters this scenario, Schema provides these diagnostic steps automatically.
When investigating Kubernetes overprovisioning and cost waste, start by measuring overall cluster efficiency to understand the scale of the problem, then identify abandoned resources for quick wins before diving into container-level request-to-usage analysis. Finally, check for resource reservation issues that prevent efficient bin-packing and drive unnecessary cluster scaling.
1. Measure overall cluster efficiency
Calculate cluster efficiency by comparing `kubernetes_cpu_usage` and `kubernetes_memory_usage` against `kubernetes_cpu_requested` and `kubernetes_memory_requested` across all pods. If the usage-to-request ratio is below 50% (as flagged by `kubecost-cluster-efficiency-below-threshold`), you have significant overprovisioning driving up costs. This gives you the baseline to quantify total waste and prioritize where to focus next.
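The ratio described above can be sketched as follows. This is a minimal illustration, not Kubecost's implementation: the per-pod dicts are hypothetical stand-ins for the `kubernetes_cpu_usage` / `kubernetes_cpu_requested` metric pairs, and the sample values are made up.

```python
# Illustrative sketch: cluster-wide usage-to-request ratios from per-pod samples.
# Field names mirror the metrics in the text; the pod data is hypothetical.

EFFICIENCY_THRESHOLD = 0.50  # below this, kubecost-cluster-efficiency-below-threshold fires

def cluster_efficiency(pods):
    """Return (cpu_ratio, memory_ratio) of total usage over total requests."""
    cpu_used = sum(p["cpu_usage"] for p in pods)
    cpu_req = sum(p["cpu_requested"] for p in pods)
    mem_used = sum(p["memory_usage"] for p in pods)
    mem_req = sum(p["memory_requested"] for p in pods)
    return cpu_used / cpu_req, mem_used / mem_req

pods = [
    {"cpu_usage": 0.2, "cpu_requested": 1.0, "memory_usage": 256, "memory_requested": 1024},
    {"cpu_usage": 0.5, "cpu_requested": 2.0, "memory_usage": 512, "memory_requested": 2048},
]

cpu_eff, mem_eff = cluster_efficiency(pods)
overprovisioned = cpu_eff < EFFICIENCY_THRESHOLD or mem_eff < EFFICIENCY_THRESHOLD
```

With these sample pods, both ratios land around 25%, well under the 50% threshold, so the cluster would be flagged as overprovisioned.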
2. Find abandoned and idle resources for quick wins
Check for deployments with zero replicas, unattached PVCs, and idle LoadBalancer services—these represent pure waste that can be eliminated immediately. As noted in `kubecost-idle-resource-cost-accumulation`, these typically account for 20-30% of cloud spend and require no complex analysis, just cleanup. This is the fastest way to reduce your bill while you investigate the deeper overprovisioning issues.
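The cleanup scan above can be sketched as a simple inventory filter. This is a hypothetical illustration: in practice the deployment, PVC, and service records would come from the Kubernetes API, and the field names and sample objects here are assumptions.

```python
# Illustrative sketch: flag the pure-waste patterns from
# kubecost-idle-resource-cost-accumulation. Inventory data is hypothetical.

def find_abandoned(deployments, pvcs, services):
    waste = []
    # Deployments scaled to zero still hold their manifests and any attached resources.
    waste += [("deployment", d["name"]) for d in deployments if d["replicas"] == 0]
    # PVCs with no bound pod keep billing for their underlying volumes.
    waste += [("pvc", p["name"]) for p in pvcs if p["bound_pod"] is None]
    # LoadBalancer services with no traffic still incur a per-hour cloud charge.
    waste += [("loadbalancer", s["name"]) for s in services
              if s["type"] == "LoadBalancer" and s["active_connections"] == 0]
    return waste

deployments = [{"name": "old-api", "replicas": 0}, {"name": "web", "replicas": 3}]
pvcs = [{"name": "orphan-data", "bound_pod": None}]
services = [{"name": "legacy-lb", "type": "LoadBalancer", "active_connections": 0}]

waste = find_abandoned(deployments, pvcs, services)
```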
3. Identify containers with the largest request-to-usage gaps
Compare `kubernetes_cpu_usage` to `kubernetes_cpu_requested` and `kubernetes_memory_usage` to `kubernetes_memory_requested` at the container level to find the worst offenders. Containers using less than 30% of their requested resources (highlighted by `kubecost-over-requested-container-waste`) are prime candidates for right-sizing. Focus on high-request containers first since reducing a 4-core request to 1 core saves more than optimizing a 0.5-core pod.
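The "high-request containers first" prioritization can be sketched like this. The container figures are hypothetical; only the 30% threshold comes from the insight named above.

```python
# Illustrative sketch: rank right-sizing candidates by absolute wasted cores,
# not by ratio alone. Container data is hypothetical.

WASTE_RATIO = 0.30  # threshold from kubecost-over-requested-container-waste

def rightsizing_candidates(containers):
    candidates = [c for c in containers
                  if c["cpu_usage"] / c["cpu_requested"] < WASTE_RATIO]
    # A 4-core request at 10% usage strands far more capacity than a
    # 0.5-core request at 10% usage, so sort by the absolute gap.
    return sorted(candidates,
                  key=lambda c: c["cpu_requested"] - c["cpu_usage"],
                  reverse=True)

containers = [
    {"name": "batch-worker", "cpu_requested": 4.0, "cpu_usage": 0.4},
    {"name": "sidecar", "cpu_requested": 0.5, "cpu_usage": 0.05},
    {"name": "api", "cpu_requested": 2.0, "cpu_usage": 1.5},
]

ranked = rightsizing_candidates(containers)
```

Here `batch-worker` and `sidecar` share the same 10% usage ratio, but `batch-worker` tops the list because it strands 3.6 cores versus 0.45.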
4. Check for sustained pod-level underutilization patterns
Monitor pod-level `kubernetes_cpu_usage` and `kubernetes_memory_usage` over time, not just point-in-time snapshots. If pods consistently use less than 30-40% of their requested resources for days or weeks, they're chronically overprovisioned (as described in `pod-cpu-and-memory-underutilization-driving-cost-waste`). Set requests to match the P75 of actual usage rather than the peak or a theoretical maximum, to avoid wasting node capacity.
5. Identify resource reservation blocking efficient scheduling
Look for nodes where `kubernetes_cpu_requested` and `kubernetes_memory_requested` exceed 80% of `kubernetes_cpu_capacity` and `kubernetes_memory_capacity`, but actual `kubernetes_cpu_usage` and `kubernetes_memory_usage` remain below 50%. This pattern (flagged by `resource-reservation-preventing-node-utilization`) causes new pods to fail scheduling despite plenty of actual capacity, forcing unnecessary cluster scaling and driving up costs. Right-sizing the over-requested pods will unlock this stranded capacity.
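The stranded-capacity pattern above reduces to two per-node ratios. A minimal sketch, with hypothetical node data standing in for the `kubernetes_*_requested`, `kubernetes_*_usage`, and `kubernetes_*_capacity` metrics:

```python
# Illustrative sketch: flag nodes matching resource-reservation-preventing-node-utilization,
# where requests consume allocatable capacity while real usage stays low.

REQUEST_PRESSURE = 0.80  # requests above 80% of capacity block new pods
USAGE_CEILING = 0.50     # while actual usage stays under 50%

def stranded_capacity_nodes(nodes):
    stranded = []
    for n in nodes:
        reserved = n["cpu_requested"] / n["cpu_capacity"]
        used = n["cpu_usage"] / n["cpu_capacity"]
        if reserved > REQUEST_PRESSURE and used < USAGE_CEILING:
            stranded.append(n["name"])
    return stranded

nodes = [
    {"name": "node-a", "cpu_capacity": 8.0, "cpu_requested": 7.2, "cpu_usage": 2.0},
    {"name": "node-b", "cpu_capacity": 8.0, "cpu_requested": 4.0, "cpu_usage": 3.0},
]

stranded = stranded_capacity_nodes(nodes)
```

Here `node-a` is 90% reserved but only 25% used: the scheduler sees it as full, so new pods trigger scale-up even though six cores sit idle.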
Related Insights
Kubecost Cluster Efficiency Below Threshold (warning)
When the cluster efficiency ratio drops below 50%, significant capacity is wasted due to overprovisioning, poor bin-packing, or resource fragmentation, driving up infrastructure costs.
Kubecost Idle Resource Cost Accumulation (warning)
Unused Kubernetes clusters, abandoned workloads with zero replicas, unattached volumes, and idle load balancers continue to accumulate costs even when not serving traffic, typically representing 20-30% waste.
Kubecost Over-Requested Container Waste (warning)
Containers requesting significantly more CPU or memory than they actually use lead to node overprovisioning and wasted cloud spend. Kubecost identifies these inefficiencies through usage-vs-request analysis.
Pod CPU and Memory Underutilization Driving Cost Waste (info)
Consistently low CPU and memory usage in pods indicates over-provisioned resource requests, leading to wasted node capacity and unnecessary infrastructure costs that can be recovered through right-sizing.
Resource Reservation Preventing Node Utilization (warning)
Kubernetes nodes show low actual resource usage but cannot schedule new pods because resource requests have consumed most allocatable capacity. This triggers cluster scaling even while actual utilization is low, leading to stranded capacity and unnecessary costs.
Inefficient CPU-to-memory Ratio Degrades Vector Workload Performance (warning)