Drive Performance Outlier Detection
Severity: critical
In distributed MinIO deployments, a single slow or failing drive can bottleneck all write operations because of erasure coding requirements, yet standard monitoring may not surface per-drive latency.
Use dperf or fio to baseline individual drive throughput. Alert when any single drive's throughput is more than 30% below that of its peers, or when a drive's p99 latency exceeds 2x the cluster median. Also watch the MinIO logs for drives with rising error or timeout counts.
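The two alert thresholds above can be sketched as a small check over per-drive measurements. This is a minimal illustration, not a MinIO or dperf API: the function name, the input dictionaries, and the drive paths are hypothetical. It also assumes that "peers" means the median throughput of all other drives, and that "cluster median" means the median of the per-drive p99 values.

```python
from statistics import median


def find_outlier_drives(throughput_mibs, p99_latency_ms,
                        throughput_drop=0.30, latency_factor=2.0):
    """Return {drive: [reasons]} for drives that breach either threshold.

    throughput_mibs: drive -> measured throughput (e.g. from dperf or fio)
    p99_latency_ms:  drive -> measured p99 latency
    Both inputs are hypothetical shapes chosen for this sketch.
    """
    flagged = {}
    # Assumption: the cluster median is the median of per-drive p99 values.
    latency_median = median(p99_latency_ms.values()) if p99_latency_ms else None

    for drive, tput in throughput_mibs.items():
        reasons = []

        # Assumption: a drive's peers are all other drives in the set.
        peers = [v for d, v in throughput_mibs.items() if d != drive]
        if peers and tput < (1 - throughput_drop) * median(peers):
            reasons.append("throughput")

        latency = p99_latency_ms.get(drive)
        if (latency is not None and latency_median is not None
                and latency > latency_factor * latency_median):
            reasons.append("latency")

        if reasons:
            flagged[drive] = reasons
    return flagged


# Made-up sample numbers, for illustration only.
sample_throughput = {"/mnt/disk1": 200, "/mnt/disk2": 210,
                     "/mnt/disk3": 205, "/mnt/disk4": 120}
sample_latency = {"/mnt/disk1": 10, "/mnt/disk2": 11,
                  "/mnt/disk3": 12, "/mnt/disk4": 30}
print(find_outlier_drives(sample_throughput, sample_latency))
```

With the sample numbers, only /mnt/disk4 is flagged, for both throughput and latency. Comparing each drive against the median of the others keeps a single bad drive from dragging down its own baseline.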
Run 'mc admin speedtest drive' (exposed as 'mc support perf drive' in recent mc releases) or use dperf to identify slow drives. Check drive health with smartctl. Verify that the drive is properly connected by checking the NUMA/PCIe topology with hwloc-ls. Replace failing drives immediately. Use 'mc support diag' to collect detailed drive diagnostics for MinIO support analysis.
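The smartctl health check can be scripted along the following lines. This is a sketch under stated assumptions: it relies on the JSON output mode of smartctl (smartmontools 7.0 or later) and reads only the overall verdict in smart_status.passed. The helper names are hypothetical, and the device path is supplied by the caller.

```python
import json
import subprocess
from typing import Optional


def smart_passed(report_json: str) -> Optional[bool]:
    """Return the overall SMART verdict, or None if it cannot be determined."""
    try:
        report = json.loads(report_json)
    except json.JSONDecodeError:
        return None
    if not isinstance(report, dict):
        return None
    status = report.get("smart_status")
    if not isinstance(status, dict):
        return None
    return status.get("passed")


def check_drive(device: str) -> Optional[bool]:
    """Run `smartctl --json -H <device>` and return its overall verdict.

    smartctl reports findings through a bitmask exit status, so a non-zero
    return code is expected for unhealthy drives and is not treated as an
    execution failure here. Usually requires root privileges.
    """
    result = subprocess.run(
        ["smartctl", "--json", "-H", device],
        capture_output=True,
        text=True,
    )
    return smart_passed(result.stdout)
```

A result of False marks the drive as a replacement candidate. A result of None means the verdict was unavailable (for example, insufficient permissions or an unsupported device) and calls for a manual look rather than an assumption of health.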