Cross-Cloud Migration Planning and Validation

Migration

Planning migration of Redis from one cloud provider to another while ensuring compatibility, minimizing downtime, and validating data integrity.

Prompt: We're migrating our Redis deployment from AWS ElastiCache to Azure Cache for Redis. I need to understand the differences between the platforms - are there metrics that are named differently or work differently? What's the best migration approach to minimize downtime and how do I validate that all data migrated correctly?

Agent Playbook

When an agent encounters this scenario, Schema provides these diagnostic steps automatically.

When migrating Redis across cloud providers, start by establishing baseline metrics on your source system to understand data volume, memory requirements, and persistence configuration. Then validate replication health before taking snapshots, plan a dual-write or replication bridge strategy to minimize downtime, and thoroughly validate data integrity post-migration by comparing key counts and memory usage. Finally, test application connectivity and monitor persistence operations to ensure the new platform handles your workload correctly.

1. Establish baseline metrics on source ElastiCache
Before starting migration, capture current state from your AWS ElastiCache instance. Check `redis.db.keys` to understand total key count (this becomes your validation target), `redis.memory.used` to size your Azure instance appropriately, and `redis.persistence.aof-enabled` to understand your durability model. These baselines are critical for validating migration completeness and ensuring you provision equivalent capacity on Azure. Document these values as your migration reference.
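As a minimal sketch of capturing that baseline, the snippet below parses raw `INFO` output into the three reference values. It assumes that `redis.db.keys`, `redis.memory.used`, and `redis.persistence.aof-enabled` correspond to the `INFO` keyspace lines, `used_memory`, and `aof_enabled` fields; that mapping depends on your monitoring integration. The function names `parse_info` and `capture_baseline` are hypothetical.

```python
from __future__ import annotations


def parse_info(text: str) -> dict[str, str]:
    """Parse raw INFO output ("key:value" lines) into a dict, skipping section headers."""
    info: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info


def capture_baseline(info: dict[str, str]) -> dict:
    """Reduce parsed INFO fields to the three baseline values (hypothetical shape)."""
    total_keys = 0
    for key, value in info.items():
        # Keyspace lines look like: db0:keys=1200,expires=300,avg_ttl=0
        if key.startswith("db") and key[2:].isdigit():
            fields = dict(part.split("=", 1) for part in value.split(","))
            total_keys += int(fields["keys"])
    return {
        "keys": total_keys,
        "used_memory_bytes": int(info["used_memory"]),
        "aof_enabled": info.get("aof_enabled", "0") == "1",
    }
```

Saving the returned dictionary alongside a timestamp gives you a fixed reference to compare against after the migration.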
2. Verify replication health and persistence status
Check `redis.replication.offset` on the primary and on every replica to confirm they are in sync. If the offsets differ significantly, replication lag could cause data loss during migration. Verify that `redis.persistence.rdb-last-bgsave-status` shows 1 (success) and that `redis.persistence.rdb-last-bgsave-time-sec` is reasonable for your dataset size. The `elasticache-redis-stale-configuration` insight shows ElastiCache can have replication consistency issues, so this validation is especially important before taking your migration snapshot.
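The offset comparison can be sketched against the primary's `INFO replication` output, which lists `master_repl_offset` and one `slaveN:` line per connected replica. Note that raw `INFO` reports `rdb_last_bgsave_status` as `ok` or `err`; the value 1 mentioned above is assumed to be your exporter's encoding of `ok`. The helper names are hypothetical.

```python
from __future__ import annotations

import re


def replica_lag_bytes(replication_info: str) -> dict[str, int]:
    """Return how many bytes each replica trails the primary."""
    master_offset = None
    replica_offsets: dict[str, int] = {}
    for line in replication_info.splitlines():
        key, _, value = line.strip().partition(":")
        if key == "master_repl_offset":
            master_offset = int(value)
        elif re.fullmatch(r"slave\d+", key):
            # e.g. slave0:ip=10.0.0.5,port=6379,state=online,offset=1000,lag=0
            fields = dict(part.split("=", 1) for part in value.split(","))
            replica_offsets[key] = int(fields["offset"])
    if master_offset is None:
        raise ValueError("master_repl_offset not found; is this the primary's output?")
    return {name: master_offset - off for name, off in replica_offsets.items()}


def last_bgsave_ok(persistence_info: str) -> bool:
    """True if the most recent background save reported success."""
    for line in persistence_info.splitlines():
        key, _, value = line.strip().partition(":")
        if key == "rdb_last_bgsave_status":
            return value == "ok"
    return False
```

What counts as a "significant" lag is workload-specific, so the sketch only reports the gap and leaves the threshold to you.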
3. Plan migration strategy to minimize downtime
For minimal downtime, bridge AWS and Azure with replication: treat your ElastiCache instance as the primary and Azure Cache as the replica, let it sync, then promote Azure. Managed Redis services generally restrict replication commands such as `REPLICAOF` and `SYNC`, so confirm that both providers support this path, or use a dual-write layer or an external sync tool instead. Alternatively, take an RDB snapshot during a low-traffic window and restore it to Azure. Given the `elasticache-redis-stale-configuration` insight showing configuration propagation delays across ElastiCache nodes, avoid migrating during high-traffic periods and plan for potential inconsistency windows. Test your cutover procedure in a staging environment first.
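One way to picture the dual-write option is a thin wrapper that writes to both stores and switches reads at cutover. This is an illustrative sketch only: `DualWriter` and its methods are hypothetical, the clients are assumed to expose `set` and `get`, and a real deployment would also need to backfill existing keys and retry failed target writes.

```python
from __future__ import annotations


class DualWriter:
    """Write to both stores; read from the source until cut_over() is called."""

    def __init__(self, source, target):
        self.source = source
        self.target = target
        self.read_from_target = False
        self.target_write_failures = 0

    def set(self, key, value):
        # The source stays the system of record, so its errors propagate.
        self.source.set(key, value)
        try:
            self.target.set(key, value)
        except Exception:
            # Do not fail the request; count the miss so it can be backfilled.
            self.target_write_failures += 1

    def get(self, key):
        client = self.target if self.read_from_target else self.source
        return client.get(key)

    def cut_over(self):
        """Switch reads to the target; writes continue to both for rollback."""
        self.read_from_target = True
```

Continuing to write to the source after cutover keeps a rollback path open until you are confident in the new platform.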
4. Validate data integrity post-migration
Immediately after migration, compare `redis.db.keys` on Azure Cache to your baseline. With writes paused, the counts should match exactly; if keys carry TTLs, account for any that expired between the baseline and the comparison. Check that `redis.memory.used` is similar to the baseline (some variance is expected due to platform overhead differences). Verify persistence is working by confirming `redis.persistence.rdb-last-bgsave-status` is 1 and `redis.persistence.aof-enabled` matches your source configuration. An unexplained key count mismatch indicates incomplete migration and requires immediate rollback.
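The comparison itself can be expressed as a small check. In this sketch the baseline and target are plain dictionaries with the hypothetical keys `keys`, `used_memory_bytes`, and `aof_enabled`, and the 20% memory tolerance is an assumed default, not a recommended value.

```python
from __future__ import annotations


def validate_migration(
    baseline: dict, target: dict, memory_tolerance: float = 0.2
) -> list[str]:
    """Return a list of human-readable problems; an empty list means the check passed."""
    problems: list[str] = []
    if target["keys"] != baseline["keys"]:
        problems.append(
            f"key count mismatch: source={baseline['keys']} target={target['keys']}"
        )
    base_mem = baseline["used_memory_bytes"]
    if base_mem > 0:
        drift = abs(target["used_memory_bytes"] - base_mem) / base_mem
        if drift > memory_tolerance:
            problems.append(
                f"memory differs by {drift:.0%} (tolerance {memory_tolerance:.0%})"
            )
    if target["aof_enabled"] != baseline["aof_enabled"]:
        problems.append("AOF setting differs from source")
    return problems
```

Returning every problem rather than stopping at the first makes the rollback decision easier to justify in a migration log.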
5. Test application connectivity and authentication
Before cutover, verify applications can connect to Azure Cache for Redis on your configured port. Azure defaults to TLS on port 6380 and disables the non-TLS port 6379 unless you explicitly enable it. The `redis-broker-unavailable-prevents-celery` insight shows how critical basic connectivity is: Celery workers and other clients will fail completely if Redis isn't accessible. Test authentication (Azure's access keys work differently from ElastiCache authentication), network security groups, and TLS settings. Run a smoke test with a non-production application to verify read/write operations work correctly.
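A smoke test can be as small as one round trip through a throwaway key. The sketch below assumes a client object exposing `set`, `get`, and `delete`, as redis-py clients do; the function name `smoke_test` and the key prefix are hypothetical.

```python
from __future__ import annotations

import uuid


def smoke_test(client, prefix: str = "migration-smoke") -> bool:
    """Round-trip one throwaway key; True if the value read back matches."""
    key = f"{prefix}:{uuid.uuid4()}"
    expected = "ok"
    try:
        client.set(key, expected)
        value = client.get(key)
    finally:
        # Always try to clean up, even if the read failed.
        client.delete(key)
    if isinstance(value, bytes):
        value = value.decode()
    return value == expected


# With redis-py installed, the client could be built along these lines
# (host and key are placeholders; 6380 is Azure's TLS port):
#   import redis
#   client = redis.Redis(host="<name>.redis.cache.windows.net", port=6380,
#                        password="<access key>", ssl=True)
```

Connection, authentication, and TLS errors surface as exceptions from the first `set` call, which is usually the most informative failure during a cutover rehearsal.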
6. Monitor persistence operations on new platform
After cutover, monitor `redis.persistence.rdb-last-bgsave-time-sec` to ensure Azure Cache for Redis handles your dataset's persistence workload. If backup times are significantly longer than on ElastiCache, you may need to adjust instance size. If you use AOF, check the growth rate of `redis.persistence.aof-current-size`; unexpected growth can indicate rewrite issues. Azure and AWS may have different I/O characteristics that affect persistence performance.
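Both checks reduce to simple arithmetic over sampled metric values. In this sketch, samples are `(unix_timestamp, size_in_bytes)` pairs, and the 2x regression factor is an assumed threshold you would tune; the function names are hypothetical.

```python
from __future__ import annotations


def growth_rate(samples: list[tuple[float, int]]) -> float:
    """Average bytes per second between the first and last (timestamp, size) sample."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, s0), (t1, s1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("timestamps must increase")
    return (s1 - s0) / (t1 - t0)


def bgsave_regressed(baseline_sec: float, current_sec: float, factor: float = 2.0) -> bool:
    """True if the current save time exceeds the source baseline by more than `factor`."""
    return current_sec > baseline_sec * factor
```

A negative growth rate is expected right after an AOF rewrite, so compare rates over windows that span at least one rewrite cycle.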
7. Verify replication offset stability post-cutover
If you're using Azure Cache for Redis with replicas, monitor `redis.replication.offset` for the first 24-48 hours to ensure replication is stable and offsets are advancing consistently across primary and replicas. Given the issues with ElastiCache replication lag mentioned in `elasticache-redis-stale-configuration`, verify Azure doesn't have similar consistency problems. Inconsistent offsets post-migration indicate replication problems that could lead to data loss during failover.
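Offset stability over the observation window can be checked from periodic samples. The sketch assumes each sample is a dictionary with hypothetical keys `primary` and `replica` holding the offsets read at the same moment, and the 1 MB lag ceiling is an arbitrary placeholder.

```python
from __future__ import annotations


def offsets_stable(samples: list[dict[str, int]], max_lag_bytes: int = 1_000_000) -> bool:
    """True if the primary offset never moves backwards and replica lag stays bounded."""
    previous_primary = None
    for sample in samples:
        if previous_primary is not None and sample["primary"] < previous_primary:
            # A backwards jump may indicate a failover or full resync; investigate.
            return False
        if sample["primary"] - sample["replica"] > max_lag_bytes:
            return False
        previous_primary = sample["primary"]
    return True
```

Running this over samples collected every few minutes during the first 24-48 hours gives a single pass/fail signal to alert on.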

Monitoring Interfaces

Redis Native Metrics
Redis Datadog
Redis Prometheus