Trino / Kubernetes

Worker hangs in terminating state when coordinator shuts down first

Severity: warning · Category: availability · Updated Jun 24, 2025
How to detect:

In Kubernetes deployments, Trino workers can remain stuck in the Terminating state for up to 40 minutes when the coordinator terminates before the workers finish their shutdown sequence. Each worker keeps retrying its service announcement against the now-unavailable coordinator, receiving HTTP 503 responses on every attempt, until the pod's termination timeout finally kills it.
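A quick way to spot this condition is to look for worker pods that have a deletion timestamp but are still reported by the API server, then check their logs for the announcement retry loop. This is a sketch, assuming a hypothetical `app=trino-worker` label; adjust the selector to your deployment's labels:

```shell
# List Trino worker pods stuck in Terminating: pods being deleted carry a
# deletionTimestamp but still appear in the API until they actually exit.
kubectl get pods -l app=trino-worker -o json \
  | jq -r '.items[] | select(.metadata.deletionTimestamp != null) | .metadata.name'

# Inspect a stuck worker's log for repeated failed announcements (HTTP 503
# responses from the unreachable coordinator).
kubectl logs <worker-pod> | grep -i announce
```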

Recommended action:

Delay coordinator shutdown until all workers have completed their shutdown sequence. In custom orchestration systems, enforce the shutdown order explicitly so that workers terminate before the coordinator does. Alternatively, shorten the pod termination grace period or cap announcement retries during the worker shutdown phase so a missing coordinator cannot cause an indefinite retry loop.
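One way to implement the grace-period part of this on Kubernetes is sketched below, assuming the worker serves HTTP on port 8080 and has `curl` available in the container image (pod name, image, and port are illustrative, not from the original). The preStop hook invokes Trino's graceful-shutdown endpoint (`PUT /v1/info/state` with body `"SHUTTING_DOWN"`) so the worker drains its tasks and deregisters before exiting, and `terminationGracePeriodSeconds` bounds how long a stuck worker can linger, well under the 40-minute hang:

```yaml
# Sketch of a Trino worker pod spec (names/image are illustrative).
apiVersion: v1
kind: Pod
metadata:
  name: trino-worker
spec:
  # Must exceed Trino's shutdown.grace-period plus expected query drain time;
  # also caps how long a worker stuck retrying announcements can linger.
  terminationGracePeriodSeconds: 600
  containers:
    - name: trino
      image: trinodb/trino:latest
      lifecycle:
        preStop:
          exec:
            command:
              - curl
              - -sS
              - -X
              - PUT
              - -H
              - "Content-Type: application/json"
              - -H
              - "X-Trino-User: admin"
              - -d
              - '"SHUTTING_DOWN"'
              - http://localhost:8080/v1/info/state
```

Combined with a shutdown order that takes the coordinator down last (for example, deleting the worker workload and waiting for its pods to disappear before deleting the coordinator), this keeps workers from retrying announcements against a coordinator that no longer exists.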