Thanks for the rework/update of the AIP-72!
Just a few small comments but overall I like it as it is much leaner
than originally planned and is in a level of complexity that it really
seems to be a benefit to close the gap as described.
On 21.06.25 14:52, Jarek Potiuk wrote:
I updated the AIP - including architecture images and reviewed it (again)
and corrected any ambiguities and places where it needed to be changed.
I think the current state
https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-67+Multi-team+deployment+of+Airflow+components
- nicely describes the proposal.
Comparing to the previous one:
1. The DB changes are far less intrusive - no ripple effect on Airflow
2. There is no need to merge configurations and provide different set of
configs per team - we can add it later but I do not see why we need it in
this simplified version
3. We can still configure a different set of executors per team - that is
already implemented (we just need to wire it to the bundle -> team mapping).
I think it will be way simpler and faster to implement this way and it
should serve as MVMT -> Minimum Viable Multi Team that we can give our
users so that they can provide feedback.
J.
On Fri, Jun 20, 2025 at 8:33 AM Jarek Potiuk <ja...@potiuk.com> wrote:
I like this iteration a bit more now for sure, thanks for being receptive
to feedback! :)
This now becomes quite close to what was proposing before, we now again
have a team ID (which I think is really needed here, glad to see it back)
and it will be used for auth management, configuration specification, etc
but will be carried by Bundle instead of the dag model. Which as you say
“For that we will need to make sure that both api-server, scheduler and
triggerer have access to the "bundle definition" (to perform the mapping)"
which honestly doesn’t feel too much different from the original proposal
we had last week of adding it to Dag table and ensuring it’s available
everywhere. but either way I’m happy to meet in the middle and keep it on
Bundle if everyone else feels that’s a more suitable location.
I think the big difference is the "ripple effect" that was discussed in
https://lists.apache.org/thread/78vndnybgpp705j6sm77l1t6xbrtnt5c (and I
believe - correct me if I am wrong Ash - important trigger for the
discussion) so far what we wanted is to extend the primary key and it would
ripple through all the pieces of Airflow -> models, API, UI etc. ...
However - we already have `bundle_name" and "bundle_version" in the Dag
model. So I think when we add a separate table where we map the bundle to
the team, the "ripple effect" will be almost 0. We do not want to change
primary key, we do not want to change UI in any way (except filtering of
DAGs available based on your team - but that will be handled in Auth
Manager and will not impact UI in any way, I think that's a huge
simplification of the implementation, and if we agree to it - i think it
should speed up the implementation significantly. There are only a limited
number of times where you need to look up the team_id - so having the
bundle -> team mapping in a separate table and having to look them up
should not be a problem. And it has much less complexity and
"ripple-effect" through the codebase (for example I could imagine 100s or
thousands already written tests that would have to be adapted if we changed
the primary key - where there will be pretty much zero impact on existing
tests if we just add bundle -> team lookup table.
One other thing I’d point out is that I think including executors per
team is a very easy win and quite possible without much work. I already
have much of the code written. Executors are already aware of Teams that
own them (merged), I have a PR open to have configuration per team (with a
quite simple and isolated approach, I believe you approved Jarek). The last
piece is updating the scheduling logic to route tasks from a particular
Bundle to the correct executor, which shouldn’t be much work (though it
would be easier if the Task models had a column for the team they belong
to, rather than having to look up the Dag and Bundle to get the team) I
have a branch where I was experimenting with this logic already.
Any who, long story short, I don’t think we necessarily need to remove
this piece from the project's scope if it is already partly done and not
too difficult.
Yeah. I hear you here again. Certainly I would not want to just
**remove** it from the code. And, yep I totally forgot we have it in. And
if we can make it in, easily (which it seems we can) - we can also include
it in the first iteration. What I wanted to avoid really (from the original
design) - again trying to simplify it, limit the changes, and speed up
implementation. And there is one "complexity" that I wanted to avoid
specifically - having to have separate , additional configuration per team.
Not only because it complicates already complex configuration handling (I
know we have PR for that) but mostly because if it is not needed, we can
simplify documentation and explain to our users easier what they need to do
to have their own multi-team setup. And I am quite open to keeping
multiple-executors if we can avoid complicating configuration.
But I think some details of that and whether we really need separate
configuration might also come as a result of updating the AIP - I am not
quite sure now if we need it, but we can discuss it when we iterate on the
AIP.
J.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
For additional commands, e-mail: dev-h...@airflow.apache.org