Hi all,

I'd like to start a discussion on a new observability feature I've proposed
in PR #68359, and get maintainer input before it moves further.

*Problem*

When a DAG runs with `catchup=False` and the scheduler advances past missed
data intervals (e.g., after a restart or re-enable), those skips are
silent: no DagRun rows are created and no hook fires. Users who need
visibility into which intervals were skipped have no in-process signal and
must compare `last_automated_data_interval` against the wall clock externally.
This was raised in issue #66791.

*Proposed Solution*

The PR adds two symmetric extension points:

1. `on_skipped_intervals_callback` -- a DAG-level callback on the SDK `DAG`
constructor, using the same context-based signature as
`on_success_callback` / `on_failure_callback`. The context includes `dag`,
`reason` ("skipped_intervals"), and `skipped_range` (a `DataInterval` from
the previous automated run's `data_interval_end` to the new run's
`data_interval_start`).

2. `on_intervals_skipped` -- an AIP-61 listener hookspec for plugins that
need in-process observability without reloading the DAG file.
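
To make the callback contract concrete, here is a minimal sketch of a
callback body that consumes the context described in item 1. It is
self-contained and does not import Airflow: the `DataInterval` class is a
stand-in with `start` and `end` fields, `describe_skipped_range` is a
hypothetical name, and the `dag` value is a placeholder string rather than a
real DAG object.

```python
from datetime import datetime, timezone
from typing import Any, Dict, NamedTuple


class DataInterval(NamedTuple):
    """Stand-in with start/end fields; not imported from Airflow."""

    start: datetime
    end: datetime


def describe_skipped_range(context: Dict[str, Any]) -> str:
    """Hypothetical callback body: summarise the skipped range from the context."""
    skipped = context["skipped_range"]
    gap = skipped.end - skipped.start
    return (
        f"{context['reason']}: "
        f"{skipped.start.isoformat()} -> {skipped.end.isoformat()} ({gap})"
    )


# Example context shaped like the one described above (assumed keys only).
example_context = {
    "dag": "example_dag",  # placeholder; the real context would carry the DAG
    "reason": "skipped_intervals",
    "skipped_range": DataInterval(
        start=datetime(2024, 1, 1, tzinfo=timezone.utc),
        end=datetime(2024, 1, 4, tzinfo=timezone.utc),
    ),
}
message = describe_skipped_range(example_context)
```

Under the proposal, a function like this would presumably be passed as
`on_skipped_intervals_callback` on the `DAG` constructor and would log or
alert instead of returning a string.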

Skipped intervals are detected at scheduled DagRun creation time by
comparing the new run's `data_interval_start` against the prior run's
`data_interval_end`. The callback/listener only fires when `not
dag.catchup` AND a callback or listener is registered, so there is no
overhead for DAGs that do not opt in.
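
The detection rule above can be sketched in isolation as follows. This is an
illustration of the comparison only, not the PR's code: the function name
`detect_skipped_range`, its parameters, and the stand-in `DataInterval` are
assumptions, and the real scheduler logic may handle more cases.

```python
from datetime import datetime, timezone
from typing import NamedTuple, Optional


class DataInterval(NamedTuple):
    """Stand-in with start/end fields; not imported from Airflow."""

    start: datetime
    end: datetime


def detect_skipped_range(
    prev_end: Optional[datetime],
    new_start: datetime,
    *,
    catchup: bool,
    has_subscribers: bool,
) -> Optional[DataInterval]:
    """Return the gap between the prior run's end and the new run's start, if any."""
    # Opt-in gate: nothing to do with catchup enabled or with nobody listening.
    if catchup or not has_subscribers:
        return None
    # No prior automated run, or the runs are contiguous: nothing was skipped.
    if prev_end is None or new_start <= prev_end:
        return None
    return DataInterval(start=prev_end, end=new_start)


prev_end = datetime(2024, 1, 1, tzinfo=timezone.utc)
new_start = datetime(2024, 1, 3, tzinfo=timezone.utc)
gap = detect_skipped_range(prev_end, new_start, catchup=False, has_subscribers=True)
```

Checking the opt-in gate before the timestamp comparison mirrors the
no-overhead claim: DAGs without a registered callback or listener return
early.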

The callback is dispatched via `DagSkippedIntervalsCallbackRequest` ->
`DatabaseCallbackSink` -> DAG File Processor. The listener fires
synchronously in the scheduler.

Tests pass locally and cover scheduler detection, callback routing and
execution, listener firing, and the serialization round trip.

*Questions for the Community:*

- Is this the right approach for this observability gap, or is there a
preferred pattern?
- Should this go through a formal AIP given it introduces a new DAG
constructor parameter and listener hookspec?

PR: https://github.com/apache/airflow/pull/68359
Original issue: https://github.com/apache/airflow/issues/66791

Happy to address any questions or adjust the approach based on feedback.

Best,
Teghveer Singh Ateliey
