Azure AKSKubernetes

Azure AKS network connectivity issues cause daemon API call hangs

critical
Connection ManagementUpdated Apr 1, 2025(via Exa)
How to detect:

Intermittent AKS connectivity issues cause daemon to hang waiting for external API requests to Postgres or Kubernetes, preventing job dequeuing. Issue affects AKS versions 1.29.7, 1.31.5, and 1.31.6.

Recommended action:

Apply dagster-k8s stability fix from PR #28934 for AKS network issues. Upgrade to dagster 1.10.13+ which includes merged PR #28924. Monitor AKS global connectivity status and related issues (reference: AKS issue #4904). Check for dropped connections when calling create_namespaced_job (issue #28314).