Lineage Impact Blast Radius Unknown During Incidents
Warning: When upstream data quality issues occur, teams cannot quickly identify the downstream impact (which dashboards, ML models, and reports are affected) because column-level lineage is incomplete or is not queried during incident response.
During active data quality incidents (assertion failures, schema changes), measure time-to-impact-assessment: how long it takes to identify the affected downstream assets. Detect cases where incidents are raised (via the DataHub incidents API) but downstream assets and their owners are not automatically notified. Monitor lineage completeness for critical datasets by checking whether ingestion sources are capturing column-level lineage.
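The time-to-impact-assessment metric can be sketched as a simple difference between two timestamps. The example below is a minimal illustration, not a DataHub API: the `IncidentRecord` shape and its field names are hypothetical and would need to be populated from your own incident tooling, and the 5-minute target is borrowed from the remediation guidance.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Target taken from the remediation guidance: assess impact within 5 minutes.
ASSESSMENT_TARGET = timedelta(minutes=5)


@dataclass
class IncidentRecord:
    """Hypothetical record shape; populate it from your own incident tooling."""
    dataset_urn: str
    raised_at: datetime
    # None means the blast radius was never assessed for this incident.
    impact_assessed_at: Optional[datetime] = None


def time_to_impact_assessment(incident: IncidentRecord) -> Optional[timedelta]:
    """Return elapsed time from incident creation to impact assessment, if any."""
    if incident.impact_assessed_at is None:
        return None
    return incident.impact_assessed_at - incident.raised_at


def breaches_target(incident: IncidentRecord,
                    target: timedelta = ASSESSMENT_TARGET) -> bool:
    """True when impact was never assessed or took longer than the target."""
    elapsed = time_to_impact_assessment(incident)
    return elapsed is None or elapsed > target
```

Treating a never-assessed incident as a breach is a deliberate choice here: an incident with no recorded assessment is the case this insight is most concerned with.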
Enable column-level lineage ingestion for all critical data sources (SQL databases, BI tools, data pipelines). Configure DataHub to automatically raise incidents on downstream assets when upstream quality checks fail. Implement Slack notifications to dataset owners that show the blast radius (the list of affected dashboards and models) when incidents occur. Use the DataHub GraphQL API to query lineage during incident triage, tracing from the failed dataset through transformations to consumer reports. Document runbooks that include lineage queries as the first step in incident response. Train on-call teams to use the DataHub lineage visualization to assess impact within the first 5 minutes of incident detection.
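The lineage query and blast-radius notification could look roughly like the sketch below. It assumes the `searchAcrossLineage` GraphQL query with the input and result fields shown; verify these names against the schema of your DataHub version before relying on them. The helper names (`build_lineage_request`, `summarize_blast_radius`, `format_notification`) are hypothetical, and the sketch only builds the request body and formats a response, leaving the HTTP call and Slack delivery to your own tooling.

```python
import json
from collections import defaultdict

# Assumed query shape; check field names against your DataHub GraphQL schema.
LINEAGE_QUERY = """
query downstream($input: SearchAcrossLineageInput!) {
  searchAcrossLineage(input: $input) {
    total
    searchResults {
      degree
      entity { urn type }
    }
  }
}
"""


def build_lineage_request(dataset_urn: str, count: int = 100) -> str:
    """Serialize the JSON body to POST to the DataHub GraphQL endpoint."""
    variables = {"input": {"urn": dataset_urn, "direction": "DOWNSTREAM",
                           "query": "*", "start": 0, "count": count}}
    return json.dumps({"query": LINEAGE_QUERY, "variables": variables})


def summarize_blast_radius(response: dict) -> dict:
    """Group downstream URNs by entity type (for example DASHBOARD, DATASET)."""
    results = response["data"]["searchAcrossLineage"]["searchResults"]
    by_type = defaultdict(list)
    for result in results:
        entity = result["entity"]
        by_type[entity["type"]].append(entity["urn"])
    return {entity_type: sorted(urns) for entity_type, urns in by_type.items()}


def format_notification(dataset_urn: str, blast_radius: dict) -> str:
    """Render a plain-text message suitable for posting to a chat channel."""
    total = sum(len(urns) for urns in blast_radius.values())
    lines = [f"Incident on {dataset_urn}: {total} downstream asset(s) affected"]
    for entity_type in sorted(blast_radius):
        urns = blast_radius[entity_type]
        lines.append(f"{entity_type} ({len(urns)}):")
        lines.extend(f"  - {urn}" for urn in urns)
    return "\n".join(lines)
```

In a runbook, the output of `format_notification` would be the first message posted to the incident channel, so that the on-call engineer sees the affected dashboards and models before starting root-cause analysis.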