Prefect Metric

prefect.flow_run.crash

Flow runs crashing unexpectedly
Dimensions: None
Available on: Native (1)
Interface Metrics (1)
Native
Total number of flow runs that crashed unexpectedly
Dimensions: None

Technical Annotations (64)

Configuration Parameters (9)
PREFECT_CLIENT_RETRY_EXTRA_CODES (recommended: 500,421)
Enables retry logic for HTTP 500 responses to prevent worker crashes
concurrency_limit (recommended: 5-100 depending on capacity)
Per-deployment cap prevents runaway concurrent executions
for_each (recommended: ['prefect.resource.id'] or ['client_id'])
Deduplicates runs per resource or client during floods
cluster-autoscaler.kubernetes.io/safe-to-evict (recommended: false)
Prevents pod eviction during node scale-down
PREFECT_API_URL (recommended: https://api.prefect.cloud/api/accounts/...)
Must point to the Prefect Cloud API with the correct account and workspace IDs
PREFECT_API_KEY
Must be a valid API key (pnu_ prefix for users, pnb_ for service accounts)
Base Job Manifest
Custom Kubernetes job configuration that may specify resource requests/limits
pool_size
SQLAlchemy connection pool size per Prefect process; limits base connections
max_overflow
SQLAlchemy maximum additional connections beyond pool_size; caps connection bursts
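The client-side parameters above are ordinary environment variables read by the Prefect client. A minimal bootstrap sketch, with placeholder account/workspace IDs and an illustrative key:

```python
import os

# Illustrative values only; substitute real account/workspace IDs and key.
os.environ["PREFECT_API_URL"] = (
    "https://api.prefect.cloud/api/accounts/<account-id>/workspaces/<workspace-id>"
)
os.environ["PREFECT_API_KEY"] = "pnu_example"  # pnu_ for users, pnb_ for service accounts

# Retry transient HTTP 500/421 responses instead of letting the worker crash:
os.environ["PREFECT_CLIENT_RETRY_EXTRA_CODES"] = "500,421"
```

Verify the effective settings afterwards with `prefect config view`.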
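The safe-to-evict annotation belongs in the pod template of the work pool's base job manifest. A sketch of the relevant metadata fragment (represented here as a Python dict; placement inside the base job template is an assumption about your pool configuration):

```python
# Fragment of a Kubernetes pod template as it might appear inside a
# Prefect base job manifest; the annotation is the only point of interest.
pod_metadata = {
    "annotations": {
        # Tells the cluster autoscaler this pod must not be evicted
        # during node scale-down, so running flows are not killed.
        "cluster-autoscaler.kubernetes.io/safe-to-evict": "false",
    }
}
```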
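Because pool_size and max_overflow bound connections per process, total demand can be budgeted against the Postgres connection limit before exhaustion errors appear. A back-of-envelope check with illustrative numbers (not Prefect defaults):

```python
# Illustrative numbers, not defaults.
pool_size = 5       # base SQLAlchemy connections per Prefect process
max_overflow = 10   # burst connections allowed beyond pool_size
processes = 4       # server replicas + workers sharing one Postgres

peak = processes * (pool_size + max_overflow)

max_connections = 100  # Postgres default max_connections
reserved = 3           # Postgres default superuser_reserved_connections
available = max_connections - reserved

print(peak, available)  # 60 97
assert peak <= available, "reduce pool sizes or put pgbouncer in front"
```

If the assertion fails, either shrink the per-process pools or front Postgres with pgbouncer, as the references below suggest.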
Error Signatures (11)
500 Internal Server Error (http status)
prefect.exceptions.PrefectHTTPStatusError (exception)
unhandled errors in a TaskGroup (exception)
TypeError: 'MockValSer' object cannot be converted to 'SchemaSerializer' (exception)
ExceptionGroup: unhandled errors in a TaskGroup (exception)
Pod never started (log pattern)
Pod has status 'Pending' (log pattern)
remaining connection slots are reserved for non-replication superuser connections (log pattern)
asyncpg.exceptions.TooManyConnectionsError (exception)
connection-limit reached error (exception)
httpx crash (exception)
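Several of these signatures (HTTP 500, PrefectHTTPStatusError) are transient, which is why PREFECT_CLIENT_RETRY_EXTRA_CODES helps. The underlying retry pattern can be sketched with a stand-in error type; all names below are hypothetical, not Prefect APIs:

```python
import time

class TransientAPIError(Exception):
    """Stand-in for an HTTP 500-style error such as PrefectHTTPStatusError."""

def submit_with_backoff(submit, attempts=3, base_delay=0.01):
    """Retry a flaky submission with exponential backoff instead of crashing."""
    for attempt in range(attempts):
        try:
            return submit()
        except TransientAPIError:
            if attempt == attempts - 1:
                raise  # give up only after the final attempt
            time.sleep(base_delay * 2 ** attempt)

calls = {"n": 0}

def flaky_submit():
    # Simulates an API that fails twice with a 500-style error, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientAPIError("500 Internal Server Error")
    return "flow-run-id"

print(submit_with_backoff(flaky_submit))  # flow-run-id
```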
CLI Commands (6)
prefect work-pool set-concurrency-limit my-pool 5 (remediation)
prefect work-queue set-concurrency-limit my-queue 5 --pool my-pool (remediation)
prefect config view (diagnostic)
prefect cloud workspace ls (diagnostic)
kubectl describe pod (diagnostic)
kubectl get nodes (diagnostic)
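The for_each parameter listed under Configuration Parameters belongs to an Automations event trigger. A sketch of such a trigger payload, shown as a Python dict; the event name and field names are assumptions based on Prefect Cloud's Automations API, and the values are illustrative:

```python
# Hypothetical trigger payload: react to crash events, but evaluate the
# threshold once per flow-run resource so an alert flood is deduplicated.
trigger = {
    "type": "event",
    "posture": "Reactive",
    "expect": ["prefect.flow-run.Crashed"],  # assumed event name
    "for_each": ["prefect.resource.id"],     # one evaluation per resource
    "threshold": 1,
    "within": 60,                            # seconds
}
```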
Technical References (38)
ETL Task3 (component)
lakefs_test_latest_file.csv (file path)
read_deployment (component)
_submit_run (component)
_check_flow_run (component)
concurrency_limit (component)
for_each (component)
backpressure (concept)
MockValSer (component)
SchemaSerializer (component)
_mark_flow_run_as_cancelled (component)
set_flow_run_state (component)
runner (component)
ghost runs (concept)
Kueue (component)
SIGTERM (concept)
SIGKILL (concept)
kubernetes-job (component)
Pending (concept)
zombie flow run (concept)
terminal state (concept)
Crashed (component)
Lazarus Service (component)
Failed state (concept)
Cloud hooks (component)
agent (component)
worker (component)
flow run (concept)
on_failure (concept)
asyncpg (component)
SQLAlchemy (component)
pgbouncer (component)
prefect-server (component)
prefect-worker (component)
run_deployment (component)
httpx (component)
state handlers (concept)
Automations API (component)
Related Insights (16)
Pipeline task fails when required data column is missing from input file (critical)
Worker exits unexpectedly when API server returns HTTP 500 (critical)
Missing pipeline metrics delay data quality issue detection (warning)
Alert flooding kills workers without concurrency limits (critical)
Pydantic MockValSer serialization error prevents container flow startup (critical)
Ghost runs persist when runners die without server notification (critical)
Kubernetes autoscaler evicts running Prefect jobs during scale-down (warning)
Incorrect API configuration causes flow run failures (critical)
Empty flow logs indicate infrastructure startup failure (critical)
Kubernetes pods remain Pending and never start after 60 seconds (critical)
Zombie flow runs prevent terminal state due to infrastructure or network failures (critical)
Distressed flows fail permanently after 3 Lazarus retry cycles (critical)
Flow run submission failure sets incorrect Failed state instead of Crashed (warning)
PostgreSQL connection pool exhaustion causes Prefect server errors and worker crashes (critical)
Connection limit reached when orchestrating parallel sub-flows via run_deployment (critical)
False crash notifications triggered during resource provisioning delay (warning)