Query Node CPU Spike During Search-Phase Transition
Warning: When Milvus transitions from index building to serving concurrent searches, CPU usage spikes significantly (from ~21 cores to a peak of 28 cores, roughly a one-third increase). The spike creates a temporary bottleneck that queues subsequent requests and increases in-queue latency.
Monitor CPU core utilization across query nodes during workload transitions. Watch for a sharp rise in CPU usage (more than 30% above the pre-transition level) when concurrent search load begins, accompanied by rising in-queue latency. The pattern shows up as a distinct peak in CPU metrics at the boundary between the indexing and search phases.
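The detection rule above can be sketched as a small check over two aligned metric series. This is a minimal illustration, not a Milvus API: the function name `detect_transition_spike`, the `baseline_window` parameter, and the sample values are all hypothetical, and the series are assumed to have been exported from whatever monitoring system is in use.

```python
from statistics import mean


def detect_transition_spike(cpu_cores, queue_latency_ms,
                            baseline_window=3, threshold=0.30):
    """Return True if CPU rises more than `threshold` above its baseline
    while in-queue latency also rises above its own baseline.

    cpu_cores and queue_latency_ms are equal-length lists of samples taken
    at the same timestamps; the first `baseline_window` samples are assumed
    to come from before the search phase begins (an assumption of this sketch).
    """
    if len(cpu_cores) != len(queue_latency_ms):
        raise ValueError("metric series must have the same length")
    if len(cpu_cores) <= baseline_window:
        return False  # not enough data after the baseline to judge

    cpu_base = mean(cpu_cores[:baseline_window])
    lat_base = mean(queue_latency_ms[:baseline_window])
    if cpu_base <= 0:
        return False  # avoid dividing by a zero or invalid baseline

    cpu_peak = max(cpu_cores[baseline_window:])
    lat_peak = max(queue_latency_ms[baseline_window:])

    cpu_increase = (cpu_peak - cpu_base) / cpu_base
    return cpu_increase > threshold and lat_peak > lat_base


# Illustrative samples only: ~21 cores during indexing, 28 cores at peak.
cpu = [21, 21, 21, 24, 28, 23]
latency = [5, 5, 5, 9, 40, 12]
spike = detect_transition_spike(cpu, latency)
```

With these sample values the CPU increase is 7/21, just over the 30% threshold, and latency rises with it, so the check flags the transition.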
Scale out query nodes to handle transition load spikes, or implement request rate limiting during known transition periods. Pre-warm query nodes before switching workloads. Consider dedicating separate query nodes for search vs. indexing operations to avoid resource contention.
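One way to apply request rate limiting during a known transition period is a client-side token bucket placed in front of the search calls. The sketch below is an assumption about how such a throttle might look, not a built-in Milvus feature: the `TokenBucket` class, its parameters, and the chosen rates are hypothetical, and the actual search call is left to the caller.

```python
import time


class TokenBucket:
    """Client-side throttle: allows bursts up to `capacity` requests,
    then refills at `rate_per_sec` tokens per second."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock          # injectable so the limiter can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Return True if a request may be sent now, consuming one token."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Hypothetical limits for the transition window: 50 searches/s, bursts of 10.
limiter = TokenBucket(rate_per_sec=50, capacity=10)
if limiter.allow():
    pass  # issue the search request here
else:
    pass  # delay, queue, or reject the request on the client side
```

Throttling on the client keeps the queue from building up on the query nodes during the spike. The rate can be raised back to normal once CPU usage settles after the transition.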