MySQL

Binary Log Spike Cascade

critical
storage

Updated Dec 30, 2024

Sudden spike in binary log disk usage triggers CPU, network, and IOPS saturation, creating a cascading slowdown. Large write operations (bulk updates, schema changes) generate excessive binlog data faster than it can be purged, leading to storage pressure and performance degradation.

How to detect:

mysql.binlog.disk_use increasing rapidly (>50% growth in <1 hour) while mysql.performance.cpu_time remains relatively low (<30%), correlated with spikes in mysql.client.network.io and mysql.innodb.os_file_writes
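As a rough illustration, the rule above can be evaluated over two observations taken at the start and end of a window. This is a minimal sketch, not a monitor definition: the `Sample` type, the `binlog_spike_detected` function, and the `spike_factor` used to decide what counts as a "spike" in network I/O and file writes are all hypothetical, and the CPU value is assumed to be already expressed as a percentage.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One observation of the metrics named in the detection rule (hypothetical type)."""
    minutes: float          # minutes elapsed since the start of the observation window
    binlog_disk_use: float  # mysql.binlog.disk_use, in bytes
    cpu_percent: float      # CPU utilization as a percentage (assumed already derived)
    network_io: float       # mysql.client.network.io
    os_file_writes: float   # mysql.innodb.os_file_writes


def binlog_spike_detected(first, last, growth_threshold=0.50,
                          cpu_ceiling=30.0, spike_factor=2.0):
    """Return True when the window matches the binlog spike pattern.

    spike_factor is an assumption: the rule only says "spikes", so here a
    spike means the metric at least doubled between the two samples.
    """
    elapsed = last.minutes - first.minutes
    # The rule applies to growth observed in under one hour.
    if elapsed <= 0 or elapsed >= 60:
        return False
    if first.binlog_disk_use <= 0:
        return False  # relative growth is undefined without a baseline

    growth = (last.binlog_disk_use - first.binlog_disk_use) / first.binlog_disk_use
    io_spike = (last.network_io >= spike_factor * first.network_io
                and last.os_file_writes >= spike_factor * first.os_file_writes)
    return growth > growth_threshold and last.cpu_percent < cpu_ceiling and io_spike
```

For example, a window in which binlog disk use grows from 1.0 GB to 1.8 GB in 45 minutes, CPU stays at 18%, and both I/O metrics more than double would match. The same window with CPU at 55% would not.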

Recommended action:

Inspect the binlog contents with mysqlbinlog to identify bulk operations (pt-online-schema-change, large UPDATE/DELETE statements). Query binlog_files to check the number of binlog files. Increase binlog retention hours temporarily if replication needs it, but only if adequate storage is available. Consider upgrading storage IOPS if the high write load is sustained. Monitor mysql.innodb.data_pending_writes and mysql.innodb.os_file_fsyncs to confirm an I/O bottleneck.
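To make the file-count check concrete, the sketch below summarizes a list of binlog files and their sizes. It assumes the rows come from MySQL's `SHOW BINARY LOGS` statement, whose first two columns are `Log_name` and `File_size`. The `summarize_binlogs` helper is hypothetical, and fetching the rows (with whichever client library is in use) is left out.

```python
def summarize_binlogs(rows):
    """Summarize binlog files given (log_name, file_size_bytes, ...) rows.

    Assumes rows shaped like the output of SHOW BINARY LOGS; any columns
    after the first two are ignored.
    """
    sizes = [int(row[1]) for row in rows]
    # max() returns the first row among ties, so the earliest largest file wins.
    largest = max(rows, key=lambda row: int(row[1]))[0] if rows else None
    return {
        "file_count": len(sizes),
        "total_bytes": sum(sizes),
        "largest_file": largest,
    }


# Illustrative rows only; the file names and sizes are made up.
example_rows = [
    ("mysql-bin.000101", 1073741824),
    ("mysql-bin.000102", 1073741824),
    ("mysql-bin.000103", 524288000),
]
summary = summarize_binlogs(example_rows)
```

Comparing `file_count` and `total_bytes` across two runs a few minutes apart gives a quick sense of how fast binlog data is accumulating relative to purging.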