Slow OSD Operations Signal Performance Degradation
Warning: When OSDs report slow operations (requests that take longer than `osd_op_complaint_time`, 30 seconds by default), the usual cause is a disk I/O bottleneck, a journal problem, or a network problem that degrades cluster performance. This is one of the most common Ceph performance issues.
Monitor for slow operation warnings in cluster health output or logs. Check `ceph osd perf` output for elevated commit/apply latencies. Use `ceph health detail` to identify OSDs with slow ops, and `ceph daemon osd.X dump_historic_slow_ops` (run on the host where osd.X lives) to analyze patterns.
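The latency check above can be scripted. The following is a minimal sketch, not a definitive tool: it assumes `ceph osd perf --format json` returns a list under `osd_perf_infos` (nested under `osdstats` on newer releases), and the function names and the 100 ms threshold are hypothetical choices for illustration only.

```python
import json
import subprocess


def osd_latencies(perf):
    """Map OSD id -> (commit_ms, apply_ms) from parsed `ceph osd perf` JSON.

    Assumption: newer releases nest the list under "osdstats",
    older ones put "osd_perf_infos" at the top level.
    """
    infos = perf.get("osdstats", perf).get("osd_perf_infos", [])
    return {
        info["id"]: (
            info["perf_stats"]["commit_latency_ms"],
            info["perf_stats"]["apply_latency_ms"],
        )
        for info in infos
    }


def slow_osds(perf, threshold_ms=100):
    """Return sorted OSD ids whose commit or apply latency exceeds threshold_ms.

    threshold_ms is an illustrative value, not a Ceph default.
    """
    return sorted(
        osd_id
        for osd_id, (commit_ms, apply_ms) in osd_latencies(perf).items()
        if max(commit_ms, apply_ms) > threshold_ms
    )


def fetch_perf():
    """Run the CLI; needs a reachable cluster and a usable keyring."""
    result = subprocess.run(
        ["ceph", "osd", "perf", "--format", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


# Made-up sample data in the assumed layout, so the sketch runs without a cluster.
SAMPLE_PERF = {
    "osdstats": {
        "osd_perf_infos": [
            {"id": 0, "perf_stats": {"commit_latency_ms": 4, "apply_latency_ms": 4}},
            {"id": 1, "perf_stats": {"commit_latency_ms": 250, "apply_latency_ms": 180}},
            {"id": 2, "perf_stats": {"commit_latency_ms": 12, "apply_latency_ms": 130}},
        ]
    }
}

print(slow_osds(SAMPLE_PERF))
```

On a live cluster, `slow_osds(fetch_perf())` would give the candidates to inspect further with `ceph daemon osd.X dump_historic_slow_ops`.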
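Identify the operation type: `osd_op` is a client request and usually points to disk I/O on the primary OSD, while `osd_repop` is a replication request and usually points to a slow replica or the network between OSDs. For disk issues, check disk latency and the health of the journal device (the WAL/DB device on BlueStore). For replica issues, verify network connectivity between OSDs. Consider tuning `osd_op_threads` (on recent releases, the sharded op queue options `osd_op_num_shards` and `osd_op_num_threads_per_shard` take its place) or investigating hardware problems on the specific OSD.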
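The triage by operation type can be sketched as a small script over the JSON from `ceph daemon osd.X dump_historic_slow_ops`. This is a rough illustration under stated assumptions: the op list is assumed to sit under an `Ops` (or `ops`) key with a `description` field that starts with the op type, and the majority-vote rule in `likely_cause` is a hypothetical heuristic, not something Ceph provides.

```python
from collections import Counter


def op_type(description):
    """Return the text before the first '(' in an op description.

    Example: 'osd_repop(client.1.0:2 ...)' -> 'osd_repop'.
    """
    return description.split("(", 1)[0].strip()


def summarize_slow_ops(dump):
    """Count slow ops per type in parsed dump_historic_slow_ops output.

    Assumption: ops are listed under "Ops" (or "ops") and each entry
    has a "description" string.
    """
    ops = dump.get("Ops") or dump.get("ops") or []
    return dict(Counter(op_type(op.get("description", "")) for op in ops))


def likely_cause(counts):
    """Hypothetical heuristic: whichever op type dominates decides the hint."""
    client_ops = counts.get("osd_op", 0)
    replica_ops = counts.get("osd_repop", 0)
    if client_ops == 0 and replica_ops == 0:
        return "no slow osd_op/osd_repop recorded"
    if replica_ops > client_ops:
        return "check replica OSDs and the network between OSDs"
    return "check disk latency and journal device health"


# Made-up sample entries in the assumed layout; real descriptions are longer.
SAMPLE_DUMP = {
    "Ops": [
        {"description": "osd_op(client.4123.0:17 ...)", "duration": 31.2},
        {"description": "osd_repop(client.4123.0:18 ...)", "duration": 44.0},
        {"description": "osd_repop(client.4123.0:19 ...)", "duration": 37.5},
    ]
}

counts = summarize_slow_ops(SAMPLE_DUMP)
print(counts, "->", likely_cause(counts))
```

The hint is only a starting point: a single OSD with many slow `osd_op` entries and normal peers still suggests a hardware problem on that OSD rather than a cluster-wide issue.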