Re: Discuss: AIP-67 (multi team) now that AIP-82 (External event driven dags) exists

Vikram Koka Sun, 22 Jun 2025 19:37:16 -0700

This is a very difficult thread to catch up on.
I will take a detailed look at the AIP update to try to figure out the
changes in the proposal.


>From a quick glance, the updated AIP didn't seem to have any reference to
AIP-82, which surprised me, but will take a more detailed read through.

Vikram



On Sun, Jun 22, 2025 at 1:44 AM Pavankumar Gopidesu <gopidesupa...@gmail.com>
wrote:

> Thanks Jarek, that's a great update on this AIP, now it's much more slim
> down.
>
> left a minor comment. :) Overall looking great.
>
> Pavan
>
> On Sat, Jun 21, 2025 at 3:10 PM Jens Scheffler <j_scheff...@gmx.de.invalid
> >
> wrote:
>
> > Thanks for the rework/update of the AIP-72!
> >
> > Just a few small comments but overall I like it as it is much leaner
> > than originally planned and is in a level of complexity that it really
> > seems to be a benefit to close the gap as described.
> >
> > On 21.06.25 14:52, Jarek Potiuk wrote:
> > > I updated the AIP - including architecture images and reviewed it
> (again)
> > > and corrected any ambiguities and places where it needed to be changed.
> > >
> > > I think the current state
> > >
> >
> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-67+Multi-team+deployment+of+Airflow+components
> > > - nicely describes the proposal.
> > >
> > > Comparing to the previous one:
> > >
> > > 1. The DB changes are far less intrusive - no ripple effect on Airflow
> > > 2. There is no need to merge configurations and provide different set
> of
> > > configs per team - we can add it later but I do not see why we need it
> in
> > > this simplified version
> > > 3. We can still configure a different set of executors per team - that
> is
> > > already implemented (we just need to wire it to the bundle -> team
> > mapping).
> > >
> > > I think it will be way simpler and faster to implement this way and it
> > > should serve as MVMT -> Minimum Viable Multi Team that we can give our
> > > users so that they can provide feedback.
> > >
> > > J.
> > >
> > >
> > >
> > >
> > > On Fri, Jun 20, 2025 at 8:33 AM Jarek Potiuk <ja...@potiuk.com> wrote:
> > >
> > >>
> > >>
> > >>> I like this iteration a bit more now for sure, thanks for being
> > receptive
> > >>> to feedback! :)
> > >>>
> > >>
> > >>> This now becomes quite close to what was proposing before, we now
> again
> > >>> have a team ID (which I think is really needed here, glad to see it
> > back)
> > >>> and it will be used for auth management, configuration specification,
> > etc
> > >>> but will be carried by Bundle instead of the dag model. Which as you
> > say
> > >>> “For that we will need to make sure that both api-server, scheduler
> and
> > >>> triggerer have access to the "bundle definition" (to perform the
> > mapping)"
> > >>> which honestly doesn’t feel too much different from the original
> > proposal
> > >>> we had last week of adding it to Dag table and ensuring it’s
> available
> > >>> everywhere. but either way I’m happy to meet in the middle and keep
> it
> > on
> > >>> Bundle if everyone else feels that’s a more suitable location.
> > >>>
> > >> I think the big difference is the "ripple effect" that was discussed
> in
> > >> https://lists.apache.org/thread/78vndnybgpp705j6sm77l1t6xbrtnt5c
> (and I
> > >> believe - correct me if I am wrong Ash - important trigger for the
> > >> discussion) so far what we wanted is to extend the primary key and it
> > would
> > >> ripple through all the pieces of Airflow -> models, API, UI etc. ...
> > >> However - we already have `bundle_name" and "bundle_version" in the
> Dag
> > >> model. So I think when we add a separate table where we map the bundle
> > to
> > >> the team, the "ripple effect" will be almost 0. We do not want to
> change
> > >> primary key, we do not want to change UI in any way (except filtering
> of
> > >> DAGs available based on your team - but that will be handled in Auth
> > >> Manager and will not impact UI in any way, I think that's a huge
> > >> simplification of the implementation, and if we agree to it - i think
> it
> > >> should speed up the implementation significantly. There are only a
> > limited
> > >> number of times where you need to look up the team_id - so having the
> > >> bundle -> team mapping in a separate table and having to look them up
> > >> should not be a problem. And it has much less complexity and
> > >> "ripple-effect" through the codebase (for example I could imagine 100s
> > or
> > >> thousands already written tests that would have to be adapted if we
> > changed
> > >> the primary key - where there will be pretty much zero impact on
> > existing
> > >> tests if we just add bundle -> team lookup table.
> > >>
> > >>
> > >>> One other thing I’d point out is that I think including executors per
> > >>> team is a very easy win and quite possible without much work. I
> already
> > >>> have much of the code written. Executors are already aware of Teams
> > that
> > >>> own them (merged), I have a PR open to have configuration per team
> > (with a
> > >>> quite simple and isolated approach, I believe you approved Jarek).
> The
> > last
> > >>> piece is updating the scheduling logic to route tasks from a
> particular
> > >>> Bundle to the correct executor, which shouldn’t be much work (though
> it
> > >>> would be easier if the Task models had a column for the team they
> > belong
> > >>> to, rather than having to look up the Dag and Bundle to get the
> team) I
> > >>> have a branch where I was experimenting with this logic already.
> > >>> Any who, long story short, I don’t think we necessarily need to
> remove
> > >>> this piece from the project's scope if it is already partly done and
> > not
> > >>> too difficult.
> > >>>
> > >> Yeah. I hear you here again. Certainly I would not want to just
> > >> **remove** it from the code. And, yep I totally forgot we have it in.
> > And
> > >> if we can make it in, easily (which it seems we can) - we can also
> > include
> > >> it in the first iteration. What I wanted to avoid really (from the
> > original
> > >> design) - again trying to simplify it, limit the changes, and speed up
> > >> implementation. And there is one "complexity" that I wanted to avoid
> > >> specifically - having to have separate , additional configuration per
> > team.
> > >> Not only because it complicates already complex configuration handling
> > (I
> > >> know we have PR for that) but mostly because if it is not needed, we
> can
> > >> simplify documentation and explain to our users easier what they need
> > to do
> > >> to have their own multi-team setup. And I am quite open to keeping
> > >> multiple-executors if we can avoid complicating configuration.
> > >>
> > >> But I think some details of that and whether we really need separate
> > >> configuration might also come as a result of updating the AIP - I am
> not
> > >> quite sure now if we need it, but we can discuss it when we iterate on
> > the
> > >> AIP.
> > >>
> > >> J.
> > >>
> > >>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
> > For additional commands, e-mail: dev-h...@airflow.apache.org
> >
> >
>

Re: Discuss: AIP-67 (multi team) now that AIP-82 (External event driven dags) exists

Reply via email to