Worker exits unexpectedly when API server returns HTTP 500
critical | availability | Updated Feb 5, 2025 (via Exa)
Technologies: Prefect, Kubernetes
How to detect:
When a self-hosted Prefect API server becomes overloaded and returns HTTP 500 Internal Server Error responses, Kubernetes workers exit unexpectedly instead of handling the error gracefully. The failure occurs in two call paths, both of which end in read_deployment: _submit_run -> _check_flow_run -> read_deployment, and cancel_run -> _get_configuration -> read_deployment.
Recommended action:
Set the PREFECT_CLIENT_RETRY_EXTRA_CODES environment variable to include status code 500 (e.g., PREFECT_CLIENT_RETRY_EXTRA_CODES='500,421') so that the client retries these responses instead of failing on them. Long-term: increase the server deployment's resources to prevent overload at scale (20k+ flow runs/day).
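
To illustrate what adding 500 to the retry codes changes, here is a minimal Python sketch of retry-on-status logic. It is not Prefect's implementation: parse_extra_codes, call_with_retries, the request callable, and the DEFAULT_RETRY_CODES set are all hypothetical names and values chosen for illustration. Only the comma-separated format of the value ('500,421') comes from the recommendation above.

```python
import time

# Codes retried without any extra configuration in this sketch (an assumption).
DEFAULT_RETRY_CODES = {429, 502, 503}


def parse_extra_codes(raw):
    """Parse a comma-separated value such as '500,421' into a set of ints."""
    return {int(part) for part in raw.split(",") if part.strip()}


def call_with_retries(request, extra_codes, max_attempts=5, base_delay=0.0):
    """Call request() until it returns a non-retryable status or attempts run out.

    `request` is a hypothetical zero-argument callable returning an HTTP status
    code. The last status seen is returned either way.
    """
    retry_codes = DEFAULT_RETRY_CODES | extra_codes
    status = None
    for attempt in range(max_attempts):
        status = request()
        if status not in retry_codes:
            return status
        # Exponential backoff between attempts; zero by default to keep the demo fast.
        time.sleep(base_delay * 2 ** attempt)
    return status


# Simulate an overloaded server that fails twice with 500 and then recovers.
responses = iter([500, 500, 200])
final_status = call_with_retries(
    lambda: next(responses), parse_extra_codes("500,421")
)
```

With 500 among the extra codes, the simulated call keeps retrying until the server recovers. With an empty extra set, the first 500 is returned to the caller straight away, which corresponds to the point at which the worker exits in the scenario described above.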