My comment on the name is about the suggested component that runs the
workload, not about the feature itself. I just suggest a more generic
name so that, if the need arises, it would be easier to execute different
kinds of workloads on it (like callbacks).

As for reusing the Triggerer, I am not a fan of that. It serves a completely
different purpose, and combining both cases may result in poor use of
autoscaling. I don't think alerts/callbacks/other "misc" workloads should
compete for the same resources as actual tasks.

On Thu, May 22, 2025 at 16:19, Jarek Potiuk <ja...@potiuk.com> wrote:

> How about Option 3) making it part of triggerer.
>
> I think that goes in the direction we've been discussing in the past where
> we have a "generic workload" that we can submit from any of the other
> components and that will be executed in triggerer.
>
> * that would not add too much complexity - no extra process to manage
> * triggerer is obligatory part of installation now anyway
> * usually machines today have more processors, and triggerer, with its event
> loop, does not seem to be too busy in terms of multi-processor usage (there
> are extra processes accessing the DB but still not much I think). It could
> fork another process to run just deadline checks.
> * re - multi-team it's even easier, triggerer is already going to be
> "per-team".
> * we could even rename triggerer to "generic workload processor" (well, a
> shorter name, but to indicate that it could process any kind of workload -
> not only deferred triggers).
>
> Re: comments from Elad:
>
> 1) Naming wise: I think we settled on the name already (looong discussion,
> naming is hard) and I think the scope of it is just really "deadlines" (we
> also wanted to distinguish it from SLA) - I like the name for this
> particular callback type, but yes - I agree it should be more generic, open
> for any future types of callbacks. If we go for triggerer handling "generic
> workload" - that is IMHO "generic enough" to handle any future workloads.
>
> 2) I believe this is something that could be handled by the callback.
> The callback could have the option to submit a "cancel" request for
> the task it is called back for (via task.sdk API) - but that should be up
> to the one who writes the callback.
>
> J.
>
> On Thu, May 22, 2025 at 10:03 AM Elad Kalif <elad...@apache.org> wrote:
>
> > I prefer option 2 but I have questions.
> > 1. Naming wise maybe we should prefer a more generic name as I am not sure
> > if it should be limited to deadlines? (maybe should be shared with
> > executing callbacks?)
> > 2. How do you plan to manage the queue of alerts? What happens if the
> > process is unhealthy while workers continue to execute tasks?
> >
> > On Thu, May 22, 2025 at 12:56 AM Ryan Hatter
> > <ryan.hat...@astronomer.io.invalid> wrote:
> >
> > > +1 for option 2, primarily because of:
> > >
> > > > It would be more robust and resilient, and therefore be able to run the
> > > > callbacks *even in presence of certain kinds of issues like the scheduler
> > > > being bogged-down*
> > >
> > >
> > > On Wed, May 21, 2025 at 5:09 PM Kataria, Ramit
> > > <ramit...@amazon.com.invalid> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I’m working with Dennis on Deadline Alerts (AIP-86). I'd like to discuss
> > > > implementation approaches for executing callbacks when Deadline Alerts are
> > > > triggered. As you may know, the old SLA feature has been removed, and we're
> > > > planning to introduce Deadline Alerts as a replacement in 3.1. When a
> > > > deadline is missed, we need a mechanism to execute callbacks (which could
> > > > be notifications or other actions).
> > > >
> > > > I’ve identified two main approaches:
> > > >
> > > > Option 1: Scheduler-based
> > > > In this approach, the scheduler would check on a regular interval to see
> > > > if the earliest deadline has passed and then queue the callback to run in
> > > > an executor (local or remote). The executor would be specified when
> > > > creating the deadline alert and if there’s none specified, then the default
> > > > executor would be used.
> > > >
> > > > Option 2: New DeadlineProcessor process
> > > > In this approach, there would be a new process similar to
> > > > triggerer/dag-processor, completely independent from the scheduler, to check
> > > > for deadlines on a regular interval and also run the callbacks without
> > > > queueing them in another executor.
> > > >
> > > > Multi-team considerations: For multi-team later this year, option 2 would
> > > > be relatively simple to implement. However, for option 1, the callbacks
> > > > would have to run on a remote executor since there would be no local
> > > > executor.
> > > >
> > > > I recommend going with option 2 because:
> > > >
> > > >   *   It would be more robust and resilient, and therefore be able to run
> > > > the callbacks even in presence of certain kinds of issues like the
> > > > scheduler being bogged-down
> > > >   *   It would also run the callbacks almost instantly instead of having
> > > > to wait for an executor (especially if there’s a long queue of tasks or a
> > > > cold-start delay)
> > > >      *   This could be mitigated by implementing a priority system where
> > > > the deadline callbacks are prioritized over regular tasks but this is a
> > > > non-trivial problem with my current understanding of Airflow’s
> > > > architecture
> > > >   *   It would avoid a potential slight increase in workload for the
> > > > scheduler
> > > >      *   The additional workload in the scheduler for option 1 would be
> > > > checking to see if the earliest deadline has passed on a regular interval
> > > >
> > > > However, it would introduce another process for admins to deploy and
> > > > manage, and also likely require more effort to implement, therefore taking
> > > > longer to complete.
> > > >
> > > > So, I’d like to hear your thoughts on these approaches, anything I may
> > > > have missed and if you agree/disagree with this direction. Thank you for
> > > > your input!
> > > >
> > > >
> > > > Best,
> > > >
> > > > Ramit Kataria
> > > > SDE at AWS
> > > >
> > >
> >
>
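
For readers following the thread, here is a minimal sketch of the Option 2 idea described above: a standalone loop that checks on a regular interval whether the earliest deadline has passed and runs the callback itself, without queueing it in an executor. This is not Airflow code. `DeadlineProcessorSketch`, `PendingDeadline`, `add` and `run_once` are hypothetical names invented for illustration, and the in-memory heap stands in for whatever storage the real feature would use.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Callable, List


@dataclass(order=True)
class PendingDeadline:
    """Hypothetical record: ordered by due time, then insertion order."""
    due: datetime
    seq: int
    callback: Callable[[], None] = field(compare=False)


class DeadlineProcessorSketch:
    """Hypothetical stand-in for the proposed process (not an Airflow API)."""

    def __init__(self) -> None:
        self._heap: List[PendingDeadline] = []
        self._counter = itertools.count()
        self.failures: List[Exception] = []

    def add(self, due: datetime, callback: Callable[[], None]) -> None:
        # The heap keeps the earliest deadline at index 0.
        heapq.heappush(
            self._heap, PendingDeadline(due, next(self._counter), callback)
        )

    def run_once(self, now: datetime) -> int:
        """Run every callback whose deadline has passed; return how many ran."""
        ran = 0
        while self._heap and self._heap[0].due <= now:
            item = heapq.heappop(self._heap)
            try:
                # Executed in this process, not handed to an executor.
                item.callback()
            except Exception as exc:
                # One failing callback must not block the remaining ones.
                self.failures.append(exc)
            ran += 1
        return ran


fired: List[str] = []
processor = DeadlineProcessorSketch()
t0 = datetime(2025, 5, 22, 12, 0, tzinfo=timezone.utc)
processor.add(t0 + timedelta(minutes=5), lambda: fired.append("later"))
processor.add(t0 - timedelta(minutes=1), lambda: fired.append("missed"))
processor.run_once(t0)  # only the already-missed deadline fires
```

A real process would call something like `run_once(datetime.now(timezone.utc))` in a loop with a sleep between iterations. The questions raised in the thread (queue durability, health of the process, isolation from task resources) are exactly the parts this sketch leaves out.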
