> So, I propose, DAG.max_active_tasks should be evaluated per-dag-run.  And
> we can change the name accordingly if folks are on board.

Agreed. And possibly we could name it DAG.max_active_tasks_per_run.

J.
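
To make the difference between the two scopes concrete, here is a minimal
sketch in plain Python. It is not Airflow's scheduler code; the function
name `schedulable` and the (run_id, task_id) tuples are made up for
illustration. It only counts how many queued task instances could start
when the same limit is evaluated across all runs of a dag versus per dag run.

```python
from collections import Counter


def schedulable(queued, running, limit, scope):
    """Return the queued task instances that may start under `limit`.

    queued / running: lists of (run_id, task_id) tuples for ONE dag.
    scope: "dag" counts running tasks across all runs of the dag,
           "run" counts them separately for each dag run.
    This is a toy model, not Airflow's actual scheduling logic.
    """
    if scope not in ("dag", "run"):
        raise ValueError(f"unknown scope: {scope!r}")

    def bucket(run_id):
        # With "dag" scope every run shares one counter.
        return "all" if scope == "dag" else run_id

    active = Counter(bucket(run_id) for run_id, _ in running)
    started = []
    for run_id, task_id in queued:
        key = bucket(run_id)
        if active[key] < limit:
            active[key] += 1
            started.append((run_id, task_id))
    return started


# Three concurrent runs of one dag, four queued tasks each, limit of 4.
queued = [(f"run_{r}", f"task_{t}") for r in range(3) for t in range(4)]
per_dag = schedulable(queued, running=[], limit=4, scope="dag")
per_run = schedulable(queued, running=[], limit=4, scope="run")
print(len(per_dag))  # 4: one run gets every slot, the others wait
print(len(per_run))  # 12: each run gets its own 4 slots
```

With a single active run the two scopes give the same answer, which is the
point made in the quoted message: the meaning only diverges once several
runs of the same dag are active at once.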

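For the pool workaround mentioned in the quoted message below, a dag
definition might look roughly like this. It is a sketch assuming Airflow 2.4+
import paths; the dag id, pool name, and slot count are hypothetical, and the
pool has to exist already (for example via
`airflow pools set example_dag_pool 16 "cap for example_pooled_dag"`).

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="example_pooled_dag",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule=None,
    max_active_runs=4,  # several runs may be active at once
    # Every task inherits this pool, so the pool's slot count caps
    # running tasks across all runs of this dag.
    default_args={"pool": "example_dag_pool"},  # hypothetical pool
) as dag:
    for i in range(10):
        BashOperator(task_id=f"task_{i}", bash_command="sleep 30")
```

The cap then lives in the pool's slot count rather than in a dag-level
parameter, so it stays an explicit opt-in for the dags that want it.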

On Fri, Oct 4, 2024 at 3:30 PM Ryan Hatter
<ryan.hat...@astronomer.io.invalid> wrote:

> I think I agree with this:
>
> > I feel it should be applied at the dag *run* scope
> > and not across all dag runs.
>
> Just a thought: If someone *did* want to run multiple DAG runs at the same
> time and limit the max active tasks per DAG, they could create a pool for
> that DAG and pass the pool in default_args.
>
> On Fri, Oct 4, 2024 at 1:51 PM Daniel Standish
> <daniel.stand...@astronomer.io.invalid> wrote:
>
> > Ok, sorry, these concurrency settings are confusing.
> >
> > Let me clarify.
> >
> > `max_active_tasks_per_dag` is a core airflow setting and it provides the
> > default for DAG.max_active_tasks.
> >
> > DAG.max_active_tasks I think is a reasonable config to have but the problem
> > in my view is the scope.  I feel it should be applied at the dag *run* scope
> > and not across all dag runs.  That just gets into confusing and footgunish
> > territory if you allow many concurrent dag runs but limit the number of
> > concurrent tasks.  Then you might have many, many dag runs going but all
> > limping along.
> >
> > So I guess let me change my proposal.  I would propose that we have
> > DAG.max_active_tasks be applied at the dag *run* scope.  Not limiting
> > concurrency across all dag runs.
> >
> > I think in practice this is essentially what it already is, because I would
> > expect that the vast majority of dag runs are the only dag run running for
> > a given dag at a given time.  It's only when you have many dag runs of the
> > same dag running that this parameter ends up meaning something different.
> >
> > So, I propose, DAG.max_active_tasks should be evaluated per-dag-run.  And
> > we can change the name accordingly if folks are on board.
> >
> > Now whether a mapped task is a task or not, I leave that for another day :)
> >
> > On Fri, Oct 4, 2024 at 10:28 AM Daniel Standish <
> > daniel.stand...@astronomer.io> wrote:
> >
> > > The setting max_active_tasks_per_dag seems mostly useless to me, and
> > > footgunish.
> > >
> > > Why?
> > >
> > > Because you already have a setting for max active dag runs.  If you don't
> > > want to run more tasks, don't create the extra dag runs.
> > >
> > > We also already have a mechanism (param on base operator) for limiting
> > > individual tasks across all dag runs where that may be needed.  But just a
> > > general "I don't want more than 16 tasks running across all dag runs of all
> > > types and for all tasks" seems imprecise and not useful.
> > >
> > > I actually think it makes sense to remove this param entirely.  But at
> > > least we should remove the default.
> > >
> > > WDYT
> > >
> >
>
