ElastiCache Redis causes stale clouddriver account configuration and pipeline failures
critical | configuration | Updated Mar 24, 2026
How to detect:
After migrating from the hal-created Redis pod to AWS ElastiCache Redis (r4.large with 1 read replica), deployManifest stages fail at random because account configuration is inconsistent across clouddriver nodes. By comparison, the hal-created Redis running in Kubernetes performs better and keeps account configuration consistent.
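One way to confirm the inconsistency is to collect the account list that each clouddriver node reports and compare them. The sketch below assumes those lists have already been gathered (for example by querying each clouddriver pod directly; how they are fetched is left out here). The node and account names in the sample are hypothetical.

```python
from typing import Dict, Iterable, Set


def find_account_drift(accounts_by_node: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Return, per node, the account names that node is missing.

    An account counts as missing when at least one other node reports it.
    Nodes that see every known account are left out of the result.
    """
    node_sets = {node: set(names) for node, names in accounts_by_node.items()}
    all_accounts: Set[str] = set().union(*node_sets.values())
    return {
        node: all_accounts - names
        for node, names in node_sets.items()
        if all_accounts - names
    }


# Hypothetical sample: pod names and account names are made up.
sample = {
    "clouddriver-a": ["prod-k8s", "staging-k8s"],
    "clouddriver-b": ["prod-k8s"],
}
print(find_account_drift(sample))  # {'clouddriver-b': {'staging-k8s'}}
```

An empty result means every node reported the same set of accounts at the time the lists were collected. Because the failures are intermittent, the comparison may need to be repeated.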
Recommended action:
Revert to the hal-created Redis pod deployed in the Kubernetes cluster instead of using ElastiCache. If ElastiCache must be used, investigate its cache consistency settings, the replication lag between the primary and the read replica, and the network latency between Spinnaker services and ElastiCache. Review the clouddriver logs for 'Unable to run agents' and 'internal server errors'. Consider using S3 for persistent storage with in-cluster Redis for caching.
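The log review and the replication-lag check can be sketched as two small helpers. This is a minimal illustration, not a diagnostic tool. It assumes the clouddriver log lines are available as text (for example from `kubectl logs` output), and that the output of Redis `INFO replication` from the primary has been captured as a string. The sample log lines and offsets are hypothetical.

```python
from typing import Dict, Iterable

# The two messages named above. Matching is case-insensitive and uses the
# singular "internal server error" so that both singular and plural match;
# the exact wording in real logs is an assumption.
MARKERS = ("unable to run agents", "internal server error")


def count_markers(lines: Iterable[str]) -> Dict[str, int]:
    """Count how many log lines contain each marker."""
    counts = {marker: 0 for marker in MARKERS}
    for line in lines:
        lowered = line.lower()
        for marker in MARKERS:
            if marker in lowered:
                counts[marker] += 1
    return counts


def replica_offset_lag(info_text: str) -> Dict[str, int]:
    """Bytes each replica is behind the primary, parsed from the primary's
    `INFO replication` text (master_repl_offset minus the replica offset)."""
    master_offset = None
    replica_offsets: Dict[str, int] = {}
    for raw in info_text.splitlines():
        key, sep, value = raw.strip().partition(":")
        if not sep:
            continue  # section headers such as "# Replication"
        if key == "master_repl_offset":
            master_offset = int(value)
        elif key.startswith("slave") and "offset=" in value:
            fields = dict(f.split("=", 1) for f in value.split(","))
            replica_offsets[key] = int(fields["offset"])
    if master_offset is None:
        return {}
    return {name: master_offset - off for name, off in replica_offsets.items()}


# Hypothetical inputs for illustration only.
sample_log = [
    "ERROR Unable to run agents",
    "WARN  Internal server error while caching",
    "INFO  cache refresh complete",
]
sample_info = (
    "# Replication\n"
    "role:master\n"
    "connected_slaves:1\n"
    "slave0:ip=10.0.0.5,port=6379,state=online,offset=1000,lag=1\n"
    "master_repl_offset:1042\n"
)
print(count_markers(sample_log))
print(replica_offset_lag(sample_info))  # {'slave0': 42}
```

A byte-offset gap that stays well above zero between samples would suggest that reads served by the replica can return older data than the primary holds, which is consistent with the stale-configuration symptom.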