Replica count undershoots or overshoots when concurrency mismatches batch size
warning | performance | Updated Mar 24, 2026
Technologies:
How to detect:
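A Service uses adaptive batching or continuous batching, and its concurrency setting differs from the configured batch size. The autoscaler then targets the wrong number of replicas. With concurrency below the batch size, replicas are added before batches fill, which leaves them underutilized. With concurrency above the batch size, requests queue on each replica before the Service scales out.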
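The effect can be illustrated with a little arithmetic. The sketch below assumes a simplified autoscaler that requests ceil(in-flight requests / concurrency) replicas. That model, the function names, and the numbers are illustrative assumptions, not the platform's actual scaling algorithm.

```python
import math


def replicas_requested(in_flight: int, concurrency: int) -> int:
    """Replicas an (assumed) concurrency-target autoscaler would ask for."""
    return max(1, math.ceil(in_flight / concurrency))


def replicas_needed(in_flight: int, batch_size: int) -> int:
    """Replicas needed if each replica works on one full batch at a time."""
    return max(1, math.ceil(in_flight / batch_size))


in_flight = 128   # hypothetical steady load
batch_size = 32   # hypothetical configured batch size

for concurrency in (8, 32, 128):
    requested = replicas_requested(in_flight, concurrency)
    needed = replicas_needed(in_flight, batch_size)
    if requested > needed:
        verdict = "overshoot (underutilized replicas)"
    elif requested < needed:
        verdict = "undershoot (requests queue)"
    else:
        verdict = "aligned"
    print(f"concurrency={concurrency}: requested={requested}, needed={needed}, {verdict}")
```

Under these assumptions, only the case where concurrency equals the batch size requests the same number of replicas that full batches would require.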
Recommended action:
For Services with adaptive batching or continuous batching enabled, set the concurrency parameter equal to the configured batch size. This aligns the autoscaling target with the number of requests each replica can process in one batch and improves throughput. Review your batch configuration first, then update concurrency to match.
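As a minimal sketch of this review step, the helper below compares the two values in a settings dictionary and returns a corrected copy. The keys `concurrency` and `max_batch_size` are hypothetical placeholders; substitute whatever names your Service configuration actually uses.

```python
def align_concurrency(settings: dict) -> dict:
    """Return a copy of settings with concurrency set to the batch size.

    The keys "concurrency" and "max_batch_size" are assumed names used
    for illustration only.
    """
    fixed = dict(settings)  # leave the caller's dictionary untouched
    if fixed["concurrency"] != fixed["max_batch_size"]:
        fixed["concurrency"] = fixed["max_batch_size"]
    return fixed


current = {"concurrency": 8, "max_batch_size": 32}  # hypothetical values
aligned = align_concurrency(current)
print(aligned)
```

Returning a copy rather than mutating the input keeps the original settings available for comparison or logging before the change is applied.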