Just adding the [DISCUSS] prefix, which I forgot to add.

On Thu, Oct 3, 2024 at 4:23 PM Daniel Standish <
daniel.stand...@astronomer.io> wrote:

> Ok so, I'm thinking through what makes sense re concurrency control in
> backfill.
>
> It was referred to
> <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=311627729#AIP78Schedulermanagedbackfill-Otherideasunderconsideration>
> in the AIP, but I didn't define the behavior there:
>
>> Other ideas under consideration
>>
>>    - Add extra concurrency control on dag run
>>    - Apply max active dag runs separately for backfill
>>    - Override any dag param in creating the backfill job and it’s only
>>      applied in that scope
>
> As I have proceeded with implementation, here's what I went with:
>
> Each "backfill" gets its own concurrency control ("max_active_runs") that
> is evaluated completely separately from the DAG-scope max_active_runs.
>
> So if the DAG's max active runs is 2 and the backfill's max active runs is 1,
> then you can have at most 3 concurrent runs.  Non-backfill dag runs cannot
> starve out the backfill ones, and backfill dag runs cannot starve out the
> non-backfill ones.
>
> The other way to go is to say that DAG.max_active_runs is global.  This
> does not feel quite right to me because it gets a bit murky.  E.g. what
> happens if DAG.max is 10 and Backfill.max is 10?  Do you allow it?  What do
> you do to avoid starving out non-backfill runs?
>
> What do people think?  Relevant PR is here
> <https://github.com/apache/airflow/pull/42686>.
>
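
To make the two options in the quoted message concrete, here is a rough sketch of the admission check each one implies. This is not the code in the PR or anything from the Airflow scheduler; `RunCounts`, `can_start`, and `can_start_global` are hypothetical names made up for illustration, and the numbers are the ones from the example above (DAG limit 2, backfill limit 1).

```python
from dataclasses import dataclass


@dataclass
class RunCounts:
    """Currently active dag runs for one DAG, split by origin (hypothetical type)."""
    regular: int = 0   # non-backfill runs
    backfill: int = 0  # runs belonging to one backfill


def can_start(kind: str, active: RunCounts, dag_max: int, backfill_max: int) -> bool:
    """Separate budgets: each kind of run is checked only against its own limit."""
    if kind == "backfill":
        return active.backfill < backfill_max
    return active.regular < dag_max


def can_start_global(active: RunCounts, dag_max: int) -> bool:
    """Alternative: one DAG-wide limit shared by backfill and non-backfill runs."""
    return active.regular + active.backfill < dag_max


# DAG limit 2, backfill limit 1, and two non-backfill runs already active.
active = RunCounts(regular=2, backfill=0)
print(can_start("backfill", active, dag_max=2, backfill_max=1))  # separate budgets
print(can_start("regular", active, dag_max=2, backfill_max=1))   # DAG budget is full
print(can_start_global(active, dag_max=2))                       # shared budget is full
```

With separate budgets the backfill run is still admitted when the DAG's own limit is full, so the total can reach 2 + 1 = 3. With a single shared limit the same backfill run would have to wait behind the non-backfill runs, which is the starvation question raised in the message.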
