Teradata, Apache Spark

Server becomes inaccessible during high CPU incidents

warning
availability
Updated Mar 24, 2023
How to detect:

When CPU utilization spikes, servers hosting Incorta/Spark may become unreachable, so operators cannot log in and run diagnostic commands while the incident is in progress. The spike may also subside before anyone can investigate, leaving no trace of the root cause.

Recommended action:

Pre-configure cron jobs that collect diagnostic data (ps, top -H, jstack) at regular intervals, so that a recent snapshot already exists when the server becomes unreachable. Set the collection interval shorter than the typical incident duration so that at least one run captures the problematic state.
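
The collection step could be sketched as a small shell script like the one below. This is a minimal sketch and not a vendor-supplied tool: the script name (`collect_diag.sh`), the function name `collect_snapshot`, the `OUT_ROOT` variable, the `/var/tmp/cpu-diag` output path, and the choice to match JVMs by the process name `java` are all assumptions made for illustration. The `ps` and `top` options assume the Linux procps versions of those tools.

```shell
#!/bin/sh
# Hypothetical collector (collect_diag.sh); all names and paths are assumptions.

collect_snapshot() {
    # One timestamped directory per run, under an assumed root directory.
    snap_dir="${OUT_ROOT:-/var/tmp/cpu-diag}/$(date +%Y%m%dT%H%M%S)"
    mkdir -p "$snap_dir" || return 1

    # Process list, highest CPU consumers first (procps syntax).
    if command -v ps >/dev/null 2>&1; then
        ps -eo pid,ppid,user,pcpu,pmem,etime,args --sort=-pcpu \
            > "$snap_dir/ps.txt" 2>&1 || true
    fi

    # Per-thread CPU usage: -H lists threads, -b -n 1 runs one batch iteration.
    if command -v top >/dev/null 2>&1; then
        top -H -b -n 1 > "$snap_dir/top-threads.txt" 2>&1 || true
    fi

    # Thread dump of every process named "java".
    # jstack has to run as the same user that owns the target JVM.
    if command -v jstack >/dev/null 2>&1 && command -v pgrep >/dev/null 2>&1; then
        for jvm_pid in $(pgrep -x java); do
            jstack "$jvm_pid" > "$snap_dir/jstack-$jvm_pid.txt" 2>&1 || true
        done
    fi

    # Print the snapshot location so callers can log it.
    echo "$snap_dir"
}

# Take one snapshot per invocation; cron supplies the schedule.
collect_snapshot >/dev/null
```

A crontab entry such as `* * * * * /opt/diag/collect_diag.sh` (path hypothetical) would run the script every minute, which is the finest interval cron supports. When reading the results, note that `top -H` reports thread IDs in decimal while `jstack` reports them as hexadecimal `nid` values, so convert before matching a hot thread to its stack. Snapshots accumulate over time, so some cleanup policy for old directories is also needed.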