Kubernetes

Resource Reservation Preventing Node Utilization

warning
cost_management
Updated Jan 1, 2025

Kubernetes nodes show low actual resource usage but cannot schedule new pods because pod resource requests have already claimed most of the allocatable capacity. The scheduler places pods based on their requests, not on live usage, so the cluster scales up while actual utilization is still low, which wastes capacity and adds unnecessary cost.

How to detect:

For each node, compare the sum of kubernetes_cpu_requested and kubernetes_memory_requested across its pods against kubernetes_cpu_capacity and kubernetes_memory_capacity. If requested resources exceed 80% of capacity while actual kubernetes_cpu_usage and kubernetes_memory_usage stay below 50%, the node is over-reserved. New pods then fail to schedule and remain Pending even though the node has idle CPU and memory.
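
The check above can be sketched in Python. This is a minimal illustration, not a monitor definition: the NodeResources type and is_over_reserved function are hypothetical names, the values are assumed to be per-node sums already pulled from the metrics listed above, and usage is assumed to be measured as a fraction of capacity. Run it once for CPU and once for memory.

```python
from dataclasses import dataclass


@dataclass
class NodeResources:
    """Per-node totals for one resource (CPU cores or memory bytes).

    Hypothetical container: the caller fills it from the
    kubernetes_*_requested, kubernetes_*_capacity and kubernetes_*_usage
    metrics, all in the same unit.
    """
    requested: float  # sum of requests across pods on the node
    capacity: float   # node capacity
    usage: float      # actual usage across pods on the node


def is_over_reserved(node: NodeResources,
                     request_threshold: float = 0.80,
                     usage_threshold: float = 0.50) -> bool:
    """Return True when requests are high but real usage is low."""
    if node.capacity <= 0:
        # No meaningful ratio can be computed for this node.
        return False
    request_ratio = node.requested / node.capacity
    usage_ratio = node.usage / node.capacity
    return request_ratio > request_threshold and usage_ratio < usage_threshold


if __name__ == "__main__":
    # 3.4 of 4 cores requested (85%), 1.2 cores actually used (30%).
    cpu = NodeResources(requested=3.4, capacity=4.0, usage=1.2)
    print(is_over_reserved(cpu))
```

The thresholds default to the 80% and 50% values from the rule and can be tuned per cluster.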

Recommended action:

Right-size resource requests based on actual usage patterns. Set requests to match typical usage (P50-P75) rather than peak usage. Use Vertical Pod Autoscaler (VPA) in recommendation mode (updateMode: "Off"), which reports suggested requests without changing running pods, to identify over-provisioned workloads. Consider cluster autoscaling policies that account for both requested and actual resource utilization.
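
As a rough sketch of the P50-P75 guidance, the function below derives a candidate request from a list of observed usage samples. The name recommend_request and the sample values are hypothetical, the samples are assumed to come from a representative time window, and the result is a starting point to review rather than a substitute for VPA recommendations.

```python
import statistics
from typing import Sequence


def recommend_request(usage_samples: Sequence[float],
                      percentile: int = 75) -> float:
    """Suggest a resource request at the given usage percentile.

    usage_samples: observed usage for one container, all in the same
    unit (for example millicores or MiB).
    percentile: integer from 1 to 99; 50 to 75 matches the guidance above.
    """
    if not 1 <= percentile <= 99:
        raise ValueError("percentile must be between 1 and 99")
    if not usage_samples:
        raise ValueError("at least one usage sample is required")
    if len(usage_samples) == 1:
        # A single sample is its own percentile.
        return float(usage_samples[0])
    # quantiles(n=100) returns the 99 cut points P1..P99.
    cuts = statistics.quantiles(usage_samples, n=100, method="inclusive")
    return cuts[percentile - 1]


if __name__ == "__main__":
    cpu_millicores = [100, 500, 300, 200, 400]  # illustrative samples only
    print(recommend_request(cpu_millicores, 50))
    print(recommend_request(cpu_millicores, 75))
```

Choosing a percentile nearer P75 leaves more headroom for bursts, while P50 packs nodes more tightly.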