Premature caching agent rescheduling due to timeout

warning

performance
Updated Oct 6, 2025 (via Exa)

Sources

Configure Spinnaker's Usage of Redis | Spinnaker (spinnaker.io)

Technologies:

Redis: Redis metrics correlate with this issue and help confirm the diagnosis.

How to detect:

Clouddriver reschedules a caching agent prematurely when its cache cycle runs longer than the default 300-second (5-minute) timeout, even though the agent has not failed. The result is duplicate agent runs, increased load on cloud provider APIs, and wasted resources.

Recommended action:

If monitoring shows agents consistently taking longer than 5 minutes to complete successfully, increase redis.poll.timeoutSeconds in ~/.hal/$DEPLOYMENT/profiles/clouddriver-local.yml. Set the timeout above typical agent completion times, but keep it low enough that truly hung agents are still detected.
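A minimal sketch of what the override might look like in clouddriver-local.yml, assuming the dotted property name maps onto nested YAML keys in the usual way. The value 600 is purely illustrative and not taken from the source; choose a value based on the agent completion times you actually observe.

```yaml
# ~/.hal/$DEPLOYMENT/profiles/clouddriver-local.yml
redis:
  poll:
    # Default is 300 seconds (5 minutes).
    # 600 is an assumed example value: set this above your typical
    # agent completion time, but low enough to still catch hung agents.
    timeoutSeconds: 600
```

In a Halyard-managed deployment, a profile change like this generally takes effect only after the deployment is redeployed (for example with hal deploy apply).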