I updated the AIP, including the architecture images, reviewed it (again),
and corrected the ambiguities and the places where it needed to be changed.

I think the current state
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-67+Multi-team+deployment+of+Airflow+components
nicely describes the proposal.

Compared to the previous version:

1. The DB changes are far less intrusive - no ripple effect on Airflow
2. There is no need to merge configurations and provide a different set of
configs per team - we can add it later, but I do not see why we need it in
this simplified version
3. We can still configure a different set of executors per team - that is
already implemented (we just need to wire it to the bundle -> team mapping).
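
To make points 1 and 3 concrete, here is a rough sketch of a bundle -> team
lookup table and of routing a DAG to its team's executor. It is only an
illustration under assumptions: the table names, column names, the
TEAM_EXECUTORS mapping and the helper functions are all hypothetical and are
not taken from the AIP or from the Airflow codebase. It uses an in-memory
SQLite database rather than Airflow's real models.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not Airflow's actual models.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dag (
        dag_id      TEXT PRIMARY KEY,   -- primary key stays unchanged
        bundle_name TEXT NOT NULL
    );
    CREATE TABLE bundle_team (          -- the separate lookup table
        bundle_name TEXT PRIMARY KEY,
        team_id     TEXT NOT NULL
    );
    """
)
conn.executemany(
    "INSERT INTO dag VALUES (?, ?)",
    [("etl_daily", "bundle_a"), ("ml_train", "bundle_b"), ("legacy", "bundle_c")],
)
conn.executemany(
    "INSERT INTO bundle_team VALUES (?, ?)",
    [("bundle_a", "team_data"), ("bundle_b", "team_ml")],
)

# Hypothetical per-team executor choice; a real deployment would read config.
TEAM_EXECUTORS = {"team_data": "CeleryExecutor", "team_ml": "KubernetesExecutor"}
DEFAULT_EXECUTOR = "LocalExecutor"


def team_for_dag(conn, dag_id):
    """Resolve dag -> bundle -> team; None if the DAG or mapping is missing."""
    row = conn.execute(
        """
        SELECT bt.team_id
        FROM dag d
        LEFT JOIN bundle_team bt ON bt.bundle_name = d.bundle_name
        WHERE d.dag_id = ?
        """,
        (dag_id,),
    ).fetchone()
    return row[0] if row else None


def executor_for_dag(conn, dag_id):
    """Pick the team's executor, falling back to the default one."""
    return TEAM_EXECUTORS.get(team_for_dag(conn, dag_id), DEFAULT_EXECUTOR)


if __name__ == "__main__":
    for dag_id in ("etl_daily", "ml_train", "legacy"):
        print(dag_id, team_for_dag(conn, dag_id), executor_for_dag(conn, dag_id))
```

The point of the sketch is that the dag table keeps its existing primary key
and the team is found through one extra join on the bundle name, so only the
few places that need the team have to know about the new table.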

I think it will be way simpler and faster to implement this way, and it
should serve as an MVMT (Minimum Viable Multi Team) that we can give our
users so that they can provide feedback.

J.




On Fri, Jun 20, 2025 at 8:33 AM Jarek Potiuk <ja...@potiuk.com> wrote:

>
>
>
>>
>> I like this iteration a bit more now for sure, thanks for being receptive
>> to feedback! :)
>>
>
>
>> This now becomes quite close to what was proposed before: we now again
>> have a team ID (which I think is really needed here, glad to see it back)
>> and it will be used for auth management, configuration specification, etc
>> but will be carried by Bundle instead of the dag model. As you say,
>> "For that we will need to make sure that both api-server, scheduler and
>> triggerer have access to the "bundle definition" (to perform the mapping)",
>> which honestly doesn’t feel too much different from the original proposal
>> we had last week of adding it to the Dag table and ensuring it’s available
>> everywhere. But either way I’m happy to meet in the middle and keep it on
>> Bundle if everyone else feels that’s a more suitable location.
>>
>
> I think the big difference is the "ripple effect" that was discussed in
> https://lists.apache.org/thread/78vndnybgpp705j6sm77l1t6xbrtnt5c (and, I
> believe - correct me if I am wrong, Ash - an important trigger for the
> discussion). So far, what we wanted was to extend the primary key, and that
> would ripple through all the pieces of Airflow -> models, API, UI etc.
> However, we already have "bundle_name" and "bundle_version" in the Dag
> model. So I think when we add a separate table where we map the bundle to
> the team, the "ripple effect" will be almost 0. We do not want to change
> the primary key, and we do not want to change the UI in any way (except
> filtering of the DAGs available based on your team - but that will be
> handled in the Auth Manager and will not impact the UI in any way). I think
> that's a huge simplification of the implementation, and if we agree to it,
> I think it should speed up the implementation significantly. There are only
> a limited number of times where you need to look up the team_id, so having
> the bundle -> team mapping in a separate table and having to look it up
> should not be a problem. And it has much less complexity and
> "ripple effect" through the codebase (for example, I could imagine hundreds
> or thousands of already written tests that would have to be adapted if we
> changed the primary key, whereas there will be pretty much zero impact on
> existing tests if we just add a bundle -> team lookup table).
>
>
>> One other thing I’d point out is that I think including executors per
>> team is a very easy win and quite possible without much work. I already
>> have much of the code written. Executors are already aware of the Teams
>> that own them (merged), and I have a PR open to have configuration per
>> team (with a quite simple and isolated approach, which I believe you
>> approved, Jarek). The last piece is updating the scheduling logic to route
>> tasks from a particular Bundle to the correct executor, which shouldn’t be
>> much work (though it would be easier if the Task models had a column for
>> the team they belong to, rather than having to look up the Dag and Bundle
>> to get the team). I have a branch where I was experimenting with this
>> logic already.
>> Anyhow, long story short, I don’t think we necessarily need to remove
>> this piece from the project's scope if it is already partly done and not
>> too difficult.
>>
>
> Yeah. I hear you here again. Certainly I would not want to just
> **remove** it from the code. And, yep, I totally forgot we have it in. And
> if we can make it in easily (which it seems we can), we can also include
> it in the first iteration. What I really wanted (compared to the original
> design) was, again, to simplify it, limit the changes, and speed up
> implementation. And there is one "complexity" that I wanted to avoid
> specifically: having to have separate, additional configuration per team.
> Not only because it complicates the already complex configuration handling
> (I know we have a PR for that) but mostly because, if it is not needed, we
> can simplify the documentation and explain to our users more easily what
> they need to do to have their own multi-team setup. And I am quite open to
> keeping multiple executors if we can avoid complicating configuration.
>
> But I think some details of that and whether we really need separate
> configuration might also come as a result of updating the AIP - I am not
> quite sure now if we need it, but we can discuss it when we iterate on the
> AIP.
>
> J.
>
>
