Kubernetes pods remain Pending and never start after 60 seconds

critical

availability

Updated Nov 18, 2022 (via Exa)

Sources

Agent flag prefetch-seconds does not work as expected for k8s infra · Issue #7584 · PrefectHQ/prefect (github.com)

Technologies:

Prefect (subject)

Kubernetes: the root cause of this issue originates in Kubernetes

How to detect:

Kubernetes pods created for Prefect flow runs stay in Pending status for 60 seconds and then fail with a 'Pod never started' error, which causes the flow run to fail. This indicates a pod scheduling problem in the Kubernetes cluster.
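
The detection rule above can be sketched as a small check over pod JSON. This is a minimal illustration, not Prefect's own logic: it assumes the input has the shape produced by kubectl get pods -o json (status.phase, metadata.creationTimestamp), and the helper names pending_too_long and stuck_pods are hypothetical. The 60-second threshold comes from the description above.

```python
from datetime import datetime, timezone

# Threshold taken from the symptom described above (assumption: it is
# measured from pod creation time).
PENDING_LIMIT_SECONDS = 60


def pending_too_long(pod, now, limit=PENDING_LIMIT_SECONDS):
    """Return True if the pod is Pending and was created more than `limit` seconds ago."""
    if pod.get("status", {}).get("phase") != "Pending":
        return False
    # Kubernetes reports creationTimestamp as RFC 3339 in UTC, e.g. 2022-11-18T10:00:00Z
    created = datetime.strptime(
        pod["metadata"]["creationTimestamp"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return (now - created).total_seconds() > limit


def stuck_pods(pod_list, now=None):
    """Return the names of pods in a pod-list document that are stuck in Pending."""
    now = now or datetime.now(timezone.utc)
    return [
        pod["metadata"]["name"]
        for pod in pod_list.get("items", [])
        if pending_too_long(pod, now)
    ]
```

To use it, load the output of kubectl get pods -o json with json.load and pass the resulting dictionary to stuck_pods.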

Recommended action:

Investigate Kubernetes cluster capacity and scheduling. Check node availability and resources with kubectl get nodes; verify that the resource requests and limits in the Base Job Manifest are not too high; check for image pull errors with kubectl describe pod; and confirm that the cluster has sufficient capacity. Review the pod events for the specific scheduling failure reason.
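
As a rough companion to the checks above, the sketch below pulls the two most relevant signals out of a single pod's JSON: a PodScheduled condition with status False (for example reason Unschedulable), and containers waiting on an image pull. It assumes the standard pod status fields (status.conditions, status.containerStatuses); the function name diagnose_pending and the output strings are hypothetical, and the list of image-pull reasons is not exhaustive.

```python
# Waiting reasons the kubelet commonly reports for image pull problems
# (assumption: not an exhaustive list).
IMAGE_PULL_REASONS = {"ErrImagePull", "ImagePullBackOff", "InvalidImageName"}


def diagnose_pending(pod):
    """Return human-readable findings explaining why a pod may be stuck in Pending."""
    findings = []
    status = pod.get("status", {})

    # A PodScheduled condition with status "False" means the scheduler
    # could not place the pod; reason and message say why.
    for cond in status.get("conditions", []):
        if cond.get("type") == "PodScheduled" and cond.get("status") == "False":
            reason = cond.get("reason", "unknown")
            message = cond.get("message", "")
            findings.append(f"scheduling: {reason}: {message}")

    # A container in the waiting state with an image-related reason
    # points to an image pull error rather than a capacity problem.
    for cs in status.get("containerStatuses", []):
        waiting = cs.get("state", {}).get("waiting")
        if waiting and waiting.get("reason") in IMAGE_PULL_REASONS:
            findings.append(f"image: {cs.get('name')}: {waiting['reason']}")

    return findings
```

An empty result does not mean the pod is healthy; it only means neither signal was present, so the pod events shown by kubectl describe pod remain the primary source.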