Worker exits unexpectedly when API server returns HTTP 500

critical

availability

Updated Feb 5, 2025 (via Exa)

Sources

Prefect Workers crash when server returns 500s · Issue #16977 · PrefectHQ/prefect · github.com

Technologies:

Prefect (subject)

Kubernetes: Kubernetes metrics correlate with this issue and help confirm the diagnosis

How to detect:

When a self-hosted Prefect API server becomes overloaded and returns HTTP 500 Internal Server Error responses, Kubernetes workers exit unexpectedly instead of handling the error gracefully. The crash occurs in two call paths, both of which end in read_deployment: _submit_run -> _check_flow_run -> read_deployment, and cancel_run -> _get_configuration -> read_deployment.
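To check whether a crashed worker's logs match this issue, the two call paths above can be tested for mechanically. The following is a minimal sketch, not part of Prefect: the function names are hypothetical, and it assumes the captured text is a standard Python traceback (outermost frame first) that mentions both "500" and "Internal Server Error". Actual log formats may differ.

```python
import re

# Call chains named in the issue description; both end in read_deployment.
CALL_PATHS = (
    ("_submit_run", "_check_flow_run", "read_deployment"),
    ("cancel_run", "_get_configuration", "read_deployment"),
)


def frames_in_order(text, frames):
    """Return True if every frame name appears in `text` in the given order."""
    pos = 0
    for name in frames:
        idx = text.find(name, pos)
        if idx == -1:
            return False
        pos = idx + len(name)
    return True


def matches_worker_500_crash(traceback_text):
    """Heuristic (assumed log format): HTTP 500 plus one of the known call paths."""
    has_500 = re.search(r"\b500\b", traceback_text) is not None
    has_phrase = "internal server error" in traceback_text.lower()
    if not (has_500 and has_phrase):
        return False
    return any(frames_in_order(traceback_text, path) for path in CALL_PATHS)
```

A match only suggests this failure mode; correlating it with Kubernetes pod restart metrics for the worker gives a stronger confirmation.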

Recommended action:

Set the PREFECT_CLIENT_RETRY_EXTRA_CODES environment variable to include status code 500 (e.g., PREFECT_CLIENT_RETRY_EXTRA_CODES='500,421') so that the Prefect client retries these responses instead of failing. Long-term: increase the resources allocated to the server deployment to prevent overload at scale (20k+ flow runs/day).
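If the variable may already carry other codes, it can help to merge 500 into the existing value rather than overwrite it. The sketch below shows one way to build the comma-separated value. The helper name merge_retry_codes is hypothetical, and the sketch assumes the variable is set in the worker's environment before the worker process starts.

```python
import os

ENV_VAR = "PREFECT_CLIENT_RETRY_EXTRA_CODES"


def merge_retry_codes(current, extra=(500,)):
    """Return a comma-separated code list with `extra` appended, without duplicates."""
    # Split the existing value, dropping blanks and surrounding whitespace.
    codes = [c.strip() for c in (current or "").split(",") if c.strip()]
    for code in extra:
        if str(code) not in codes:
            codes.append(str(code))
    return ",".join(codes)


if __name__ == "__main__":
    # Print a shell export line; applying it to the worker pod spec is left to the operator.
    merged = merge_retry_codes(os.environ.get(ENV_VAR))
    print(f"export {ENV_VAR}='{merged}'")
```

Retries only mask transient overload; if the server keeps returning 500s, the resource increase described above is still needed.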