Insufficient Resources Leading to Query Queueing

critical

Resource Contention

Updated Jan 1, 2025

The Presto coordinator cannot find worker nodes to run queries. This shows up as 'No nodes available to run the query' errors, together with a growing number of queued queries while the number of running queries stays flat or decreases.

Sources

Monitoring PrestoDB Database | Apache HertzBeat (hertzbeat.apache.org)

Presto Query Issues (docs.qubole.com)

Technologies:

Presto

Symptoms of this issue are visible in Presto metrics and logs.

How to detect:

Watch for an increase in presto_execution_insufficient_resources_failures_one_minute_rate. In the cluster status metrics, check whether queuedQueries rises while runningQueries plateaus or drops. Verify worker node health: executor_pool_size and executor_active may be low or zero. Also check whether CPU and memory utilization on the workers is at capacity.
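The queue-versus-running check above can be sketched as a small function over a window of metric samples. This is only an illustration: the field names queuedQueries and runningQueries come from the text, while the function name, the sample layout (one dict per poll, oldest first), and the min_samples threshold are assumptions. How the samples are collected depends on your monitoring setup and is not shown.

```python
from typing import Mapping, Sequence


def is_queueing_without_progress(
    samples: Sequence[Mapping[str, int]],
    min_samples: int = 3,  # assumed window size, tune for your polling interval
) -> bool:
    """Return True when queuedQueries grows over the window while
    runningQueries never increases from one sample to the next."""
    if len(samples) < min_samples:
        return False  # not enough data to call it a trend

    queued = [s["queuedQueries"] for s in samples]
    running = [s["runningQueries"] for s in samples]

    # Queue never shrinks between polls and ends higher than it started.
    queued_rising = (
        all(b >= a for a, b in zip(queued, queued[1:]))
        and queued[-1] > queued[0]
    )
    # Running count plateaus or drops between polls.
    running_flat_or_down = all(b <= a for a, b in zip(running, running[1:]))

    return queued_rising and running_flat_or_down
```

A True result is only a hint that matches the symptom described here; confirm it against the insufficient-resources failure rate and worker health before acting.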

Recommended action:

Add worker nodes or move to larger worker instances. Verify that the worker daemons are running (check server.log). If the coordinator is undersized, move it to a larger node so it can keep up with heartbeat collection. Check for out-of-memory errors that kill workers. Review and adjust resource allocation policies.
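As a rough aid for the out-of-memory check, the sketch below scans log lines for the JVM's OutOfMemoryError marker. It assumes that worker OOM failures appear in server.log as Java OutOfMemoryError messages; the function name and the log path in the usage comment are hypothetical, and a worker killed by the operating system rather than the JVM would not be caught this way.

```python
from typing import Iterable, List

# Substring of the standard JVM exception name java.lang.OutOfMemoryError.
OOM_MARKER = "OutOfMemoryError"


def find_oom_lines(lines: Iterable[str]) -> List[str]:
    """Return every log line that mentions a JVM out-of-memory error,
    with the trailing newline removed."""
    return [line.rstrip("\n") for line in lines if OOM_MARKER in line]


# Example usage (the path is an assumption, adjust to your installation):
#
#   with open("/var/presto/data/var/log/server.log", encoding="utf-8") as f:
#       for hit in find_oom_lines(f):
#           print(hit)
```

An empty result does not prove the workers are healthy; it only means this particular marker was not found in the lines scanned.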