Server becomes inaccessible during high CPU incidents
warning | availability | Updated Mar 24, 2023 (via Exa)
Technologies:
How to detect:
When CPU utilization spikes, servers hosting Incorta or Spark can become unreachable, so operators cannot run diagnostic commands while the incident is in progress. The spike may also subside before anyone can investigate, leaving no evidence of the root cause.
Recommended action:
Schedule scripts in crontab that automatically collect diagnostic data (ps, top -H, jstack) at regular intervals, so that snapshots already exist when the server becomes inaccessible. Set the collection interval shorter than the typical incident duration so that at least one snapshot captures the problematic state.
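
As an illustration only, a collector along these lines could be scheduled from crontab. This is a minimal sketch, not a vendor-supplied script: the function name collect_snapshot, the DIAG_DIR and RETENTION_MIN variables, the output file names, and the /opt/diag path in the sample crontab entry are all assumptions made for the example. It assumes a Linux host with procps (ps, top) and, for thread dumps, a JDK that provides jstack.

```shell
#!/usr/bin/env bash
# Hypothetical collector: snapshots process and thread state for later analysis.
# All paths, file names, and the retention period are illustrative assumptions.
#
# Sample crontab entry (one minute is the shortest interval cron supports):
#   * * * * * /opt/diag/collect_cpu_diag.sh >/dev/null 2>&1
set -u

DIAG_DIR="${DIAG_DIR:-/var/tmp/cpu-diag}"   # assumed output location
RETENTION_MIN="${RETENTION_MIN:-1440}"      # assumed retention: 24 hours

collect_snapshot() {
    local stamp out pid
    stamp="$(date +%Y%m%d-%H%M%S)"
    out="$DIAG_DIR/$stamp"
    mkdir -p "$out" || return 1

    # Process list, highest CPU consumers first.
    ps -eo pid,ppid,user,pcpu,pmem,etime,args --sort=-pcpu > "$out/ps.txt" 2>&1

    # Per-thread CPU usage. Batch mode (-b) is needed because cron has no terminal.
    top -H -b -n 1 > "$out/top-threads.txt" 2>&1

    # Thread dump for each JVM. jstack must run as the user that owns the JVM.
    # Skipped entirely when jstack is not on the PATH.
    if command -v jstack >/dev/null 2>&1; then
        for pid in $(pgrep -x java 2>/dev/null); do
            jstack -l "$pid" > "$out/jstack-$pid.txt" 2>&1
        done
    fi

    # Delete snapshot directories older than the retention period.
    find "$DIAG_DIR" -mindepth 1 -maxdepth 1 -type d \
        -mmin +"$RETENTION_MIN" -exec rm -rf {} +

    # Report where this snapshot was written.
    echo "$out"
}

collect_snapshot
```

The thread IDs in the top -H output are decimal, while jstack reports them as hexadecimal nid values, so a hot thread can be matched to its stack trace by converting between the two. If incidents tend to last less than a minute, one option is to have the script loop with sleep inside a single cron invocation, since cron cannot schedule sub-minute intervals by itself.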