I updated the AIP - including architecture images and reviewed it (again) and corrected any ambiguities and places where it needed to be changed.
I think the current state https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-67+Multi-team+deployment+of+Airflow+components - nicely describes the proposal. Comparing to the previous one: 1. The DB changes are far less intrusive - no ripple effect on Airflow 2. There is no need to merge configurations and provide different set of configs per team - we can add it later but I do not see why we need it in this simplified version 3. We can still configure a different set of executors per team - that is already implemented (we just need to wire it to the bundle -> team mapping). I think it will be way simpler and faster to implement this way and it should serve as MVMT -> Minimum Viable Multi Team that we can give our users so that they can provide feedback. J. On Fri, Jun 20, 2025 at 8:33 AM Jarek Potiuk <ja...@potiuk.com> wrote: > > > >> >> I like this iteration a bit more now for sure, thanks for being receptive >> to feedback! :) >> > > >> This now becomes quite close to what was proposing before, we now again >> have a team ID (which I think is really needed here, glad to see it back) >> and it will be used for auth management, configuration specification, etc >> but will be carried by Bundle instead of the dag model. Which as you say >> “For that we will need to make sure that both api-server, scheduler and >> triggerer have access to the "bundle definition" (to perform the mapping)" >> which honestly doesn’t feel too much different from the original proposal >> we had last week of adding it to Dag table and ensuring it’s available >> everywhere. but either way I’m happy to meet in the middle and keep it on >> Bundle if everyone else feels that’s a more suitable location. >> > > I think the big difference is the "ripple effect" that was discussed in > https://lists.apache.org/thread/78vndnybgpp705j6sm77l1t6xbrtnt5c (and I > believe - correct me if I am wrong Ash - important trigger for the > discussion) so far what we wanted is to extend the primary key and it would > ripple through all the pieces of Airflow -> models, API, UI etc. ... > However - we already have `bundle_name" and "bundle_version" in the Dag > model. So I think when we add a separate table where we map the bundle to > the team, the "ripple effect" will be almost 0. We do not want to change > primary key, we do not want to change UI in any way (except filtering of > DAGs available based on your team - but that will be handled in Auth > Manager and will not impact UI in any way, I think that's a huge > simplification of the implementation, and if we agree to it - i think it > should speed up the implementation significantly. There are only a limited > number of times where you need to look up the team_id - so having the > bundle -> team mapping in a separate table and having to look them up > should not be a problem. And it has much less complexity and > "ripple-effect" through the codebase (for example I could imagine 100s or > thousands already written tests that would have to be adapted if we changed > the primary key - where there will be pretty much zero impact on existing > tests if we just add bundle -> team lookup table. > > >> One other thing I’d point out is that I think including executors per >> team is a very easy win and quite possible without much work. I already >> have much of the code written. Executors are already aware of Teams that >> own them (merged), I have a PR open to have configuration per team (with a >> quite simple and isolated approach, I believe you approved Jarek). The last >> piece is updating the scheduling logic to route tasks from a particular >> Bundle to the correct executor, which shouldn’t be much work (though it >> would be easier if the Task models had a column for the team they belong >> to, rather than having to look up the Dag and Bundle to get the team) I >> have a branch where I was experimenting with this logic already. >> Any who, long story short, I don’t think we necessarily need to remove >> this piece from the project's scope if it is already partly done and not >> too difficult. >> > > Yeah. I hear you here again. Certainly I would not want to just > **remove** it from the code. And, yep I totally forgot we have it in. And > if we can make it in, easily (which it seems we can) - we can also include > it in the first iteration. What I wanted to avoid really (from the original > design) - again trying to simplify it, limit the changes, and speed up > implementation. And there is one "complexity" that I wanted to avoid > specifically - having to have separate , additional configuration per team. > Not only because it complicates already complex configuration handling (I > know we have PR for that) but mostly because if it is not needed, we can > simplify documentation and explain to our users easier what they need to do > to have their own multi-team setup. And I am quite open to keeping > multiple-executors if we can avoid complicating configuration. > > But I think some details of that and whether we really need separate > configuration might also come as a result of updating the AIP - I am not > quite sure now if we need it, but we can discuss it when we iterate on the > AIP. > > J. > >