Google GKE
Kubernetes

High Latency from Slow API Server or Scheduler

warning
latency
Updated Feb 23, 2026

Rising request latency in apiserver_request_duration_seconds, together with a growing share of error responses in apiserver_request_total, indicates API server overload or a scheduler bottleneck. Typical symptoms are slow pod scheduling, kubectl timeouts, and degraded cluster responsiveness.

How to detect:

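Monitor apiserver_request_duration_seconds for rising p95/p99 latency, and apiserver_request_total for an elevated error rate (the share of requests whose code label is 5xx or 429). Correlate both with scheduler metrics such as scheduler_pending_pods and scheduler_schedule_attempts_total to determine whether scheduling latency is contributing to overall control plane delay.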
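As a rough illustration of what these two signals measure, the sketch below estimates a latency quantile from cumulative histogram buckets (the shape in which apiserver_request_duration_seconds is exported) and an error ratio from per-code request counts. The function names, the sample numbers, and the choice to count 429 alongside 5xx as errors are assumptions made for illustration. In practice a monitoring backend would normally compute these, for example with PromQL's histogram_quantile.

```python
import math

def histogram_quantile(q, buckets):
    """Estimate quantile q (0..1) from cumulative (upper_bound, count) buckets.

    Uses linear interpolation inside the matching bucket, which is the same
    idea as Prometheus' histogram_quantile. The last bucket must be +Inf.
    """
    buckets = sorted(buckets, key=lambda b: b[0])
    if not buckets or not math.isinf(buckets[-1][0]):
        raise ValueError("buckets must end with a +Inf upper bound")
    total = buckets[-1][1]
    if total == 0:
        return float("nan")
    rank = q * total
    prev_bound, prev_count = 0.0, 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Quantile falls in the open-ended bucket: report the
                # highest finite bound as a lower estimate.
                return prev_bound
            if count == prev_count:
                return bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound

def error_ratio(requests_by_code):
    """Share of requests whose HTTP code is 5xx or 429 (assumed definition)."""
    total = sum(requests_by_code.values())
    if total == 0:
        return 0.0
    errors = sum(
        n for code, n in requests_by_code.items()
        if code.startswith("5") or code == "429"
    )
    return errors / total

# Hypothetical sample data, not taken from a real cluster.
sample_buckets = [(0.1, 800), (0.5, 950), (1.0, 990), (5.0, 1000), (math.inf, 1000)]
sample_codes = {"200": 9700, "429": 200, "500": 100}

p95 = histogram_quantile(0.95, sample_buckets)
p99 = histogram_quantile(0.99, sample_buckets)
errors = error_ratio(sample_codes)
```

Comparing p95 and p99 over time, rather than looking at one value, is what shows whether latency is actually increasing.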

Recommended action:

Review the control plane metrics in the Google Cloud Console under the cluster's Observability tab. If the API server is overloaded, optimize client request patterns, reduce watch and list operations, or, for larger clusters, consider scaling control plane resources. If the scheduler is slow, investigate pod affinity rules and node selection constraints.
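One common way to lighten list traffic is to request objects in pages instead of in a single large call. The Kubernetes list API supports this through its limit and continue parameters. The sketch below shows the pattern only: fetch_page is a hypothetical callable standing in for whatever client call you use, and the fake fetcher exists just to make the example runnable without a cluster.

```python
from typing import Any, Callable, Iterator, List, Optional, Tuple

# A page is (items, continue_token); the token is None on the last page.
Page = Tuple[List[Any], Optional[str]]

def iter_paginated(
    fetch_page: Callable[[int, Optional[str]], Page],
    page_size: int = 500,
) -> Iterator[Any]:
    """Yield objects page by page instead of issuing one unbounded list call.

    fetch_page(limit, continue_token) is a hypothetical wrapper around a
    real client's list call that accepts limit/continue parameters.
    """
    token: Optional[str] = None
    while True:
        items, token = fetch_page(page_size, token)
        yield from items
        if not token:
            break

def make_fake_fetcher(objects: List[Any]):
    """Build an in-memory stand-in for a paginated list endpoint."""
    calls: List[Tuple[int, Optional[str]]] = []

    def fetch_page(limit: int, token: Optional[str]) -> Page:
        calls.append((limit, token))
        start = int(token) if token else 0
        end = start + limit
        next_token = str(end) if end < len(objects) else None
        return objects[start:end], next_token

    return fetch_page, calls

# Usage with the fake endpoint: 1200 objects fetched in pages of 500.
fake_objects = list(range(1200))
fetch, recorded_calls = make_fake_fetcher(fake_objects)
listed = list(iter_paginated(fetch, page_size=500))
```

Paging bounds the size of each response the API server has to build. It does not reduce how often a client lists, so it complements, rather than replaces, cutting down on repeated list calls.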