Apache Flink, Kubernetes

TaskManager Capacity Ceiling

warning
scaling
Updated Aug 15, 2025

When too few TaskManagers or task slots are registered, jobs cannot scale out. They run below the parallelism they need and cannot absorb load increases.

How to detect:

Monitor flink_jobmanager_registeredtaskmanagers and flink_jobmanager_taskslotstotal. Capacity is constrained when a job's configured parallelism exceeds the available slots, or when CPU or memory pressure is high (for example, flink_taskmanager_status_jvm_cpu_load > 80%) and there are no free slots to scale into. Compare the number of running jobs (flink_jobmanager_runningjobs) against the registered TaskManagers and total slots to see how much capacity each job can claim.
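The checks above can be sketched as a small function. This is a minimal illustration, not a Flink API: the ClusterSnapshot type, its field names, and capacity_findings are hypothetical, and the snapshot values are assumed to have been read from the metrics backend already. required_slots is an assumed input; with Flink's default slot sharing, a job typically needs as many slots as its highest operator parallelism.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterSnapshot:
    """Point-in-time metric values. Field names are hypothetical, not a Flink API."""
    registered_taskmanagers: int  # from flink_jobmanager_registeredtaskmanagers
    task_slots_total: int         # from flink_jobmanager_taskslotstotal
    required_slots: int           # assumed: slots the jobs need at their configured parallelism
    peak_cpu_load_pct: float      # highest flink_taskmanager_status_jvm_cpu_load, as a percentage


def capacity_findings(snap: ClusterSnapshot, cpu_threshold_pct: float = 80.0) -> List[str]:
    """Return human-readable findings; an empty list means no constraint was detected."""
    findings = []
    if snap.registered_taskmanagers == 0:
        findings.append("no TaskManagers registered")

    free_slots = snap.task_slots_total - snap.required_slots
    if free_slots < 0:
        # Configured parallelism cannot be satisfied by the registered slots.
        findings.append(f"configured parallelism exceeds available slots by {-free_slots}")
    elif free_slots == 0 and snap.peak_cpu_load_pct > cpu_threshold_pct:
        # Slots are fully used and the TaskManagers are already under CPU pressure.
        findings.append("CPU pressure is high and no free slots remain")
    return findings
```

For example, a cluster with 3 TaskManagers and 12 slots whose jobs need 16 slots yields one finding about the 4 missing slots, while the same cluster with 8 required slots yields none even under high CPU load, because free slots remain to scale into.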

Recommended action:

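If you run on Kubernetes or a cloud-managed Flink service, enable auto-scaling so that TaskManagers are added dynamically. Otherwise, manually increase the TaskManager count or the number of slots per TaskManager (taskmanager.numberOfTaskSlots) in the configuration. Review maxParallelism settings to confirm there is scaling headroom, since an operator's parallelism cannot exceed its maxParallelism. On multi-tenant platforms, apply resource quotas and per-tenant monitoring to prevent noisy-neighbor problems.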
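The sizing arithmetic behind these actions can be sketched as follows. The function names are hypothetical, and the calculation assumes default slot sharing, where a job needs one slot per unit of its highest operator parallelism; clusters that use custom slot sharing groups or run several jobs need the per-job totals summed first.

```python
import math


def taskmanagers_needed(target_parallelism: int, slots_per_taskmanager: int) -> int:
    """TaskManagers required to provide target_parallelism slots (rounded up)."""
    if target_parallelism < 1 or slots_per_taskmanager < 1:
        raise ValueError("parallelism and slots per TaskManager must be positive")
    return math.ceil(target_parallelism / slots_per_taskmanager)


def parallelism_headroom(current_parallelism: int, max_parallelism: int) -> int:
    """How far parallelism can still grow before it reaches maxParallelism."""
    if current_parallelism > max_parallelism:
        raise ValueError("parallelism cannot exceed maxParallelism")
    return max_parallelism - current_parallelism
```

Under these assumptions, scaling a job to parallelism 25 with 4 slots per TaskManager requires 7 TaskManagers, and a job at parallelism 96 with maxParallelism 128 can grow by at most 32 more.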