To be honest, I am glad you raised this argument - because it means that the
goal I had in mind has been achieved. People will not think that it is easy
to add more and more teams to the picture - the more they add, the messier
it will get.

That's quite cool if you take into account what the goal of AIP-67 is.

Mon, 7 Jul 2025, 21:53 Jarek Potiuk <ja...@potiuk.com> wrote:

> To be honest, the ugliness and messiness of supporting many teams is a
> deliberate choice. It is also there to add friction in supporting 'many'
> teams. Supporting 'many' teams has never been a goal for multi-team, even
> in the last incarnation. Due to scheduler scaling especially, we should
> not give the impression that this solution can scale to 'many' teams.
>
> So that choice is pretty deliberate.
>
> Mon, 7 Jul 2025, 21:27 Oliveira, Niko
> <oniko...@amazon.com.invalid> wrote:
>
>> Thanks for getting back to me and sharing a concrete example Jarek!
>>
>> I think overloading the executor field, yet again, with another dimension
>> is going to get quite messy. If you have many teams and many executors per
>> team, you now have colons (serving two purposes), semi-colons and commas
>> in one config that will be tremendously long. I think this is a poor user
>> experience. The same goes for the executor configurations themselves.
>> And as someone who has looked through the code, I honestly and
>> respectfully disagree that this is going to be simpler code to write and
>> maintain. I think it's actually cleaner and easier, for both the users and
>> our code base, to implement this with first-class support, rather than
>> embedding team-ness into configs one-by-one and hard-coding that
>> parsing/support across the code base in the specific locations where those
>> particular configs are used. It's messy and does not scale well (it smells
>> of the executor-coupling problem we had years ago; this is going to be
>> team-coupling, where random bits of our code base will have to carry
>> these hacks instead of a first-class solution flowing through more
>> transparently).
>>
>> Cheers,
>> Niko
>>
>> ________________________________
>> From: Jarek Potiuk <ja...@potiuk.com>
>> Sent: Monday, July 7, 2025 11:15:16 AM
>> To: dev@airflow.apache.org
>> Subject: RE: [EXT] Discuss: AIP-67 (multi team) now that AIP-82 (External
>> event driven dags) exists
>>
>> I realized that I owe Niko an explanation of the configuration changes.
>> Again - following the philosophy above - a minimal set of changes to
>> "airflow internals": the "minimum" set of changes that will work. I
>> propose the change below, which has **no** changes to the way the current
>> configuration "shared" feature works - it will change the way executors
>> retrieve their configuration if they are configured "per-team" - and we
>> can 100% bank on existing multi-executors.
>> I believe that will absolutely minimise the set of changes needed to
>> implement multi-team, and we will be able to get it "faster" and with
>> "far lower risk" of impacting airflow code - in, say, the 3.1 or 3.2
>> delivery.
>>
>> The existing multi-executor configuration will be extended to include a
>> team prefix. The prefix will be separated with ":"; entries for different
>> teams will be separated with ";":
>>
>> [core]
>>
>> executor = team1:KubernetesExecutor,my.custom.module.ExecutorClass;team2:CeleryExecutor
>>
>> The configuration of executors will also be prefixed with the same team:
>>
>> [team1:kubernetes_executor]
>>
>> api_client_retry_configuration = { "total": 3, "backoff_factor": 0.5 }
>>
>> The environment variables keeping the configuration will use ___ (three
>> underscores) to replace ":". For example:
>>
>> AIRFLOW__TEAM1___KUBERNETES_EXECUTOR__API_CLIENT_RETRY_CONFIGURATION
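>>
>> For illustration only - a minimal Python sketch (my own hypothetical
>> helper, not part of the proposal; it ignores AIP-61 executor aliases,
>> which also use ":") of how such a value could be parsed:
>>
>> def parse_executors(value: str) -> dict[str, list[str]]:
>>     # "team1:ExecA,ExecB;team2:ExecC" -> {"team1": [...], "team2": [...]}
>>     teams: dict[str, list[str]] = {}
>>     for entry in value.split(";"):
>>         team, _, executors = entry.partition(":")
>>         teams[team.strip()] = [e.strip() for e in executors.split(",")]
>>     return teams
>>
>> parse_executors(
>>     "team1:KubernetesExecutor,my.custom.module.ExecutorClass"
>>     ";team2:CeleryExecutor"
>> )
>> # {'team1': ['KubernetesExecutor', 'my.custom.module.ExecutorClass'],
>> #  'team2': ['CeleryExecutor']}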
>>
>>
>> J.
>>
>> On Thu, Jul 3, 2025 at 8:47 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>>
>> > > The direction this one is taking is interesting. If you're really
>> > > just trying to make the feature barely possible and mostly targeted
>> > > towards managed providers to implement the rest, then I suppose this
>> > > hits the mark.
>> >
>> > Well, actually, by taking the direction I took, it's not "mostly for
>> > managed providers" - I see it as equally for managed providers and
>> > on-prem users. Also, following the open-source spirit, philosophically,
>> > I think any such change in Airflow should be done with those things in
>> > mind, because we are at the stage where we are already "established",
>> > and by innovating on top of what we have, we sometimes have more to
>> > lose than to gain - so I feel that with "deployment" features we should
>> > be very careful to distinguish "enabling things" vs. "doing things". My
>> > focus with this iteration was to remove all the roadblocks that make it
>> > impossible (or extremely difficult) to implement "real" multi-team and
>> > separation without modifying the airflow core. I thought: "what is the
>> > minimal set of features that will make it possible for someone
>> > motivated to deploy a single airflow for multiple teams?"
>> >
>> > * minimise the maintenance effort increase
>> > * do not "spoil" the "simple case" - we do not want to add features
>> > that make the "simple" implementation more complex than the current
>> > `docker run -it apache/airflow standalone` - that should remain simple
>> > and straightforward to run
>> > * if there is anything that involves complex deployment, we should not
>> > aim to make a "turn-key" solution that we will have to support -
>> > similarly to what we do with our configuration parameters: we have 100s
>> > of knobs to turn, and as long as the default settings are reasonable
>> > and someone "motivated" can configure and fine-tune, this configuration
>> > and fine-tuning should be left to them - regardless of whether they are
>> > on-prem or managed. And both should be able to do it.
>> >
>> > I think it's not only smart technically (we support the low-level basic
>> > features, and when someone puts them together and makes it more of a
>> > turn-key solution, they are responsible for designing and implementing
>> > it - so we have less maintenance effort). It's also good from a simple
>> > "open-source business model" point of view - i.e. it's a smart product
>> > decision we should make.
>> >
>> > Why is airflow #18 in the OSS rank? Of course we have a huge community
>> > and people contributing in their free time, completely voluntarily. And
>> > we cherish, support and encourage that. But let's be honest - if all
>> > those that make business on top of airflow had not invested literally
>> > millions of dollars (in terms of engineering salary, sponsoring Airflow
>> > Summit, and supporting people like me - at least the smart stakeholders
>> > who understand the value of it - who can be a good "community spirit"),
>> > Airflow would have an order of magnitude less activity and reach, and
>> > Airflow 3 would simply not be possible. And it is a good thing that we
>> > have those stakeholders that are interested and make money by turning
>> > Airflow into a "turn-key" solution. This is a fantastic, symbiotic
>> > relationship.
>> >
>> > So - my thinking is that we should NOT make things that make airflow
>> > more turn-key for those complex cases. We should leave that to those
>> > who want to build it and want to charge money for it. It is cool and
>> > great that they can do that - and we should not do it "for them" - but
>> > on the other hand, we should make it possible for those who want to
>> > turn airflow into something more complex (say, a multi-team solution)
>> > to make it happen - by providing them with the minimal set of features
>> > that makes it possible.
>> >
>> > And that also - in a way - keeps the balance between on-prem and
>> > managed implementations.
>> >
>> > Something that I've learned as a rule of thumb is that making a feature
>> > "generic", compared to a custom implementation, is 3x-10x more
>> > expensive (both in implementation and maintenance). That means that if
>> > an on-prem user wants to implement something for themselves (say, a
>> > turn-key multi-team solution for their case) it will cost `x`, but when
>> > a managed provider wants to implement generic multi-team it will cost
>> > `10x`. But managed providers can also spread the cost over the premium
>> > they charge their users, so that those users don't have to manage
>> > Airflow on their own and pay `x` to develop this multi-team feature
>> > themselves. And this is a "fair" choice for on-prem users to make. They
>> > might choose what they want to do then. It is also fair for the managed
>> > providers - yes, they need to invest more, but they also have a chance
>> > to shine by promoting it and making it more optimised at scale etc.
>> >
>> > That is my line of thinking.
>> >
>> >
>> > J.
>> >
>> >
>> > On Thu, Jul 3, 2025 at 1:41 AM Oliveira, Niko
>> <oniko...@amazon.com.invalid>
>> > wrote:
>> >
>> >> Hey Jarek,
>> >>
>> >>
>> >> The direction this one is taking is interesting. If you're really just
>> >> trying to make the feature barely possible and mostly targeted towards
>> >> managed providers to implement the rest, then I suppose this hits the
>> >> mark.
>> >>
>> >> But this is not something we're asking for at Amazon, and personally I
>> >> think we should make the feature reasonably usable for those running
>> >> self-managed OSS Airflow as well. There are many users running an
>> >> on-prem Airflow. Getting too hyper-fixated on an implementation that's
>> >> so simplified that it's obtuse and difficult to use by most users
>> >> seems like the wrong approach to me. But you and I have already
>> >> discussed this at length and I haven't convinced you so far, so if I'm
>> >> the only one with this thinking then I'm happy to disagree and commit,
>> >> as we say at Amazon :)
>> >>
>> >>
>> >> > So I would be rather strong on **not** touching the current
>> >> > configuration and simply adding configuration for per-team executors
>> >> > in executor config - even if it is uglier and more "low-level".
>> >>
>> >> Can you explain what "adding configuration for per-team executors in
>> >> executor config" would look like? I don't have a concrete sense of
>> >> what you mean by this.
>> >>
>> >> Thanks for your efforts in trying to get this feature agreed to and
>> >> voted on. Looking forward to working on the project in the coming
>> >> weeks!
>> >>
>> >> Cheers,
>> >> Niko
>> >>
>> >> ________________________________
>> >> From: Jarek Potiuk <ja...@potiuk.com>
>> >> Sent: Tuesday, July 1, 2025 10:26:55 PM
>> >> To: dev@airflow.apache.org
>> >> Subject: RE: [EXT] Discuss: AIP-67 (multi team) now that AIP-82
>> (External
>> >> event driven dags) exists
>> >>
>> >> Any last comments? There is a long weekend coming up in the US, so I
>> >> will likely start voting on the updated AIP on Monday the 7th.
>> >>
>> >> On Fri, Jun 27, 2025 at 12:41 PM Jarek Potiuk <ja...@potiuk.com>
>> wrote:
>> >>
>> >> > I'd really love to finalise the discussion and put it up to a vote
>> >> > some time after the recording from the last dev call is posted - so
>> >> > that there is more context and detail from the LONG discussion we
>> >> > had on it. There is no *huge* hurry - we have a strong dependency on
>> >> > Task Isolation and it seems that it will still take a bit of time to
>> >> > complete - so I'd say I would love to start voting in about a week's
>> >> > time, so that maybe at the next dev call we can "seal" the subject.
>> >> > Happy to see any more comments - especially from those who have
>> >> > opinions but had no opportunity to express them.
>> >> >
>> >> > I am personally very happy with the direction it took -
>> >> > simplification and an "MVP" kind of approach - and I also invite our
>> >> > stakeholders to take a close look at the scope and what we really
>> >> > propose - I have a feeling that we can balance it out - there is
>> >> > something we can do to make it not "worse" for the offerings they
>> >> > have. I think we have a really good symbiotic relationship here, and
>> >> > I would love to leverage that. For one - my goal here is to have a
>> >> > minimum number of changes that impact the maintainability of the
>> >> > open-source airflow - mostly "opening up some possibilities" rather
>> >> > than providing turn-key solutions. And mostly because this is good
>> >> > for all sides - less maintenance and complexity for OSS maintainers,
>> >> > but more opportunities for stakeholders to make it into "turn-key"
>> >> > solutions, while also allowing the "on-prem" users - if they are
>> >> > highly motivated - to use those features by adding the "turn-key"
>> >> > layer on their own. Also, adding multi-team should not come at the
>> >> > expense of "simple" installations - they should be virtually
>> >> > unaffected.
>> >> >
>> >> > One example of applying this is cutting "separate config files". I
>> >> > think they move us closer to a "turn-key" solution, but they are not
>> >> > really necessary to achieve the three goals above - that's why in
>> >> > the current proposal this part is completely removed. Sorry Niko,
>> >> > but I still think it's one of the things that falls into this
>> >> > bucket. We can easily remove them; they complicate the code, the
>> >> > documentation and the options the users have, and even if it is a
>> >> > "little" more complex for motivated users to manage configuration,
>> >> > it's also an opportunity for a "turn-key" option that stakeholders
>> >> > can build into their products - and we do not have to maintain it in
>> >> > the open-source. So I would be rather strong on **not** touching the
>> >> > current configuration and simply adding configuration for per-team
>> >> > executors in the executor config - even if it is uglier and more
>> >> > "low-level".
>> >> >
>> >> > So if there are some constructive ideas on what can be done to make
>> >> > it "simpler" and less "turn-key" in that respect - I would highly
>> >> > value such ideas and comments. If we can cut something more that is
>> >> > not "necessary" for the three primary goals I came up with - I am
>> >> > more than happy to do it.
>> >> >
>> >> > Just to remind you - those are the "extracted" goals. I slightly
>> >> > updated them and added them to the preamble of the AIP:
>> >> >
>> >> > * less operational overhead for managing multi-team (once AIP-72 is
>> >> > complete) where separate execution environments are important
>> >> > * virtual assets sharing between teams
>> >> > * ability of having "admin" and "team sharing" capability where dags
>> >> > from multiple teams can be seen in a single Airflow UI (requires
>> >> > custom RBAC and an AIP-56 implementation of the Auth Manager - with
>> >> > the KeyCloak Auth Manager being a reference implementation)
>> >> >
>> >> > J.
>> >> >
>> >> >
>> >> > On Thu, Jun 26, 2025 at 10:53 AM Jarek Potiuk <ja...@potiuk.com>
>> wrote:
>> >> >
>> >> >>
>> >> >>> One technical observation: Now that the dag table no longer has a
>> >> >>> team_id in it, what would the behaviour be when a DAG is attempted
>> >> >>> to be moved between bundles? How do we detect this? (I'm not at
>> >> >>> all convinced that we correctly detect duplicate dag ids across
>> >> >>> bundles today, so I wouldn't assume or rely on the current
>> >> >>> behaviour.)
>> >> >>>
>> >> >>
>> >> >> Of course - yes, I realise that - that problem was also not handled
>> >> >> in the previous iteration, to be honest. That is something the dag
>> >> >> bundle solution allows us to solve eventually - but I do not think
>> >> >> it's a blocker for the proposed implementation. We will eventually
>> >> >> have to add some way of blocking dags from jumping between bundles;
>> >> >> we might tackle this separately. I already wanted to propose a
>> >> >> separate update for that - but I did not want to complicate the
>> >> >> current proposal. One thing at a time. I can, however - if you
>> >> >> consider that a blocker - extend the current AIP with it. Not a big
>> >> >> problem. This is, however, a bit independent from the team_id
>> >> >> introduction.
>> >> >>
>> >> >>> Overall, I am still unconvinced this proposal has enough real user
>> >> >>> benefit over actually separate deployments, and on balance of the
>> >> >>> added complexity and maintenance burden I do not think it is worth
>> >> >>> it.
>> >> >>>
>> >> >>
>> >> >> That makes me sad. I thought that over the course of the discussion
>> >> >> I had addressed all the concerns (in this case the concern was "is
>> >> >> it worth it, given the cost and little benefit"), but when I did
>> >> >> that and heavily limited the impact, now the concern is "is it
>> >> >> worth it at all, as the changes are really minimal". Surely, anyone
>> >> >> can change and adapt their concerns over time, but that one seems
>> >> >> like an ever-moving target. I hoped at least for some
>> >> >> acknowledgment that some concerns (complexity in this case) were
>> >> >> addressed, but it seems that you are deeply convinced that we do
>> >> >> not need multi-team at all (which is in stark contrast with at
>> >> >> least a dozen bigger and smaller users of Airflow who submitted
>> >> >> talks to Airflow Summit (including about 5 or 6 submissions for
>> >> >> Airflow 2025) on how they spent their engineering effort, time and
>> >> >> money on trying to achieve something similar). They assessed that
>> >> >> it's worth it; you assess that it's not. Somehow I trust our users
>> >> >> that they were not spending the money, time and engineering effort
>> >> >> to achieve this because they wanted to spend more money. I think
>> >> >> they assessed it's worth it. So I want to make it a bit easier and
>> >> >> more of a "proper" way for them to do that.
>> >> >>
>> >> >>> Upgrades: it is not easier to upgrade under this multi-team
>> >> >>> proposal, but much, much harder. This is based on hard-earned
>> >> >>> experience from helping Astronomer users - having to coordinate
>> >> >>> upgrades between multiple teams turns into a months-long slog of
>> >> >>> the hardest kind of work - people work: getting other teams to
>> >> >>> agree to do things that they don't directly care about - "It's
>> >> >>> working for me, I don't care about upgrading, we'll get to it next
>> >> >>> quarter" is a refrain I've heard many times.
>> >> >>>
>> >> >> Yes, absolutely - this is why we deferred it until we knew what
>> >> >> shape task isolation and the other AIPs we depend on would take.
>> >> >> Because it is clear that pretty much all the problems you explain
>> >> >> above are going to be solved with task isolation. And it's not just
>> >> >> my opinion. If you want to argue with it, you likely need to argue
>> >> >> with yourself:
>> >> >> https://github.com/apache/airflow/issues/51545#issuecomment-2980038478
>> >> >> Let me quote what you wrote there last week:
>> >> >>
>> >> >> Ash Berlin Taylor wrote:
>> >> >>
>> >> >> > A tight coupling between task-sdk and any "server side" component
>> >> >> > is the opposite of one of the goals of AIP-72 (I'm not sure we
>> >> >> > ever explicitly said this, but the first point of motivation for
>> >> >> > the AIP says "Dependency conflicts for administrators supporting
>> >> >> > data teams using different versions of providers, libraries, or
>> >> >> > python packages")
>> >> >> > In short, my goal with TaskSDK, and the reason for introducing
>> >> >> > CalVer and Cadwyn with the execution API, is to end up in a world
>> >> >> > where you can upgrade the Airflow Scheduler/API server
>> >> >> > independently of any worker nodes (with the exception that the
>> >> >> > server must be at least as new as the clients)
>> >> >> > This ability to have version-skew is pretty much non-negotiable
>> >> >> > to me and is (other than other languages) one of the primary
>> >> >> > benefits of AIP-72
>> >> >>
>> >> >> If you read your own quote, it basically means "it will be easy to
>> >> >> upgrade airflow independently of workers". So I am a bit confused
>> >> >> here. Yes, I agree it was difficult, but you yourself explained
>> >> >> that when AIP-72 (which, since AIP-67 was accepted, has always been
>> >> >> a prerequisite of it) is done, it will be "easy". So I am not sure
>> >> >> why you are bringing it up now. We assume AIP-72 will be completed
>> >> >> and this problem will be gone. Let's not mention it any more,
>> >> >> please.
>> >> >>
>> >> >>> The true separation from TaskSDK will likely only land in about
>> >> >>> the 3.2 time frame. We are actively working on it, but it's a slow
>> >> >>> process of untangling lots of assumptions made in the code base
>> >> >>> over the years. Maybe once we have that my view would be
>> >> >>> different, but right now I think this makes the proposal a
>> >> >>> non-starter. Especially as you are saying that most teams will
>> >> >>> have unique connections. If they've got those already, then having
>> >> >>> an asset trigger use those conns to watch/poll for activity is a
>> >> >>> much easier solution to operate and, crucially, to scale and
>> >> >>> upgrade.
>> >> >>>
>> >> >>
>> >> >> Yes. I perfectly understand that and I am fully aware of the
>> >> >> potential 3.2 time-frame. And that's fine. Actually, I heartily
>> >> >> invite you to listen to the part of my talk from Berlin Buzzwords
>> >> >> where I was asked about the timeline -
>> >> >> https://youtu.be/EyhZOnbwc-4?t=2226 - this link leads to the exact
>> >> >> moment in my talk. My answer was basically "3.1" or "3.2", and I
>> >> >> sincerely hope "3.1", but we might not be able to complete it
>> >> >> because we have other things to do ("other" being indeed the Task
>> >> >> Isolation work that you are leading). And that's perfectly fine.
>> >> >> And it absolutely does not prevent us from voting on the AIP now -
>> >> >> similarly as we voted on the previous version of the AIP a few
>> >> >> months ago, knowing that it has some prerequisites. Especially
>> >> >> since we know that the feature we need from task isolation is
>> >> >> "non-negotiable". I.e. it WILL happen. We don't hope for it, we
>> >> >> know it will be there. Those are your own words.
>> >> >>
>> >> >>
>> >> >>> > I think we can't compare AIP-82 to sharing virtual assets due to
>> >> >>> > the complexity of it.
>> >> >>>
>> >> >>> Virtual Assets was a mistake, and not how users actually want to
>> >> >>> use them. Mea culpa
>> >> >>>
>> >> >>>
>> >> >>
>> >> >> This is the first time I hear this - certainly you never raised
>> >> >> this concern on the devlist. So if you have some concerns about
>> >> >> virtual assets, I think you should raise them on the devlist,
>> >> >> because I think everyone here is missing some conversation (or
>> >> >> maybe it's just your private opinion that you never shared with
>> >> >> anyone, but maybe it's worth sharing). I would be interested to
>> >> >> hear how the feature that was absolutely the most successful
>> >> >> feature of Airflow 2 was a mistake. According to the 2024 survey
>> >> >> https://airflow.apache.org/blog/airflow-survey-2024/ - 48% of
>> >> >> Airflow users have been using it, even though it was added as one
>> >> >> of the last big features of Airflow 2. It's the MOST used feature
>> >> >> out of all the features out there. I would be really curious to see
>> >> >> how it was a mistake (but please start a separate thread explaining
>> >> >> why you think it was a mistake, what your data points are, and what
>> >> >> you think should be fixed). Just dropping "virtual assets were a
>> >> >> mistake" in the middle of a multi-team conversation seems
>> >> >> completely unjustified, without us knowing what you are talking
>> >> >> about. So I think, until we know more, this argument has no basis.
>> >> >>
>> >> >>
>> >> >>>
>> >> >>> To restate my points:
>> >> >>>
>> >> >>> - Sharing a deployment between teams today/in 3.1 is operationally
>> >> >>> more complex (both scaling and upgrades) - this is a con, not a
>> >> >>> plus.
>> >> >>>
>> >> >>
>> >> >> Surely. But it will be easier when AIP-72 is complete (which I am
>> >> >> definitely looking forward to and which, as clearly explained, is a
>> >> >> prerequisite of it). Nothing changed here.
>> >> >>
>> >> >>
>> >> >>> - The main user benefit appears to be "allow teams' DAGs to
>> >> >>> communicate via Assets", in which case we can do that today by
>> >> >>> putting more work into AIP-82's Asset triggers
>> >> >>>
>> >> >>
>> >> >> No. Lower operational complexity for multi-team (provided that we
>> >> >> deliver AIP-72) is another benefit. Virtual assets are another,
>> >> >> and since there are no grounds for the "virtual assets are a
>> >> >> mistake" statement (not until you explain what you mean by that in
>> >> >> a separate discussion) - this is also still a very valid point.
>> >> >>
>> >> >>
>> >> >>> Soon, we will then be asked about cross-team governance, policy
>> >> >>> enforcement, and potentially unbounded edge cases (e.g.,
>> >> >>> team-specific secrets, roles, quotas). Again, you get this for
>> >> >>> free with truly separate deployments already
>> >> >>> allow different teams to use different executors (including
>> >> >>> multiple executors per-team following AIP-61)
>> >> >>>
>> >> >>
>> >> >> Not really. We very explicitly say in the AIP that this is not a
>> >> >> goal and that we have no plans for it. And yes, using separate
>> >> >> executors per team is actually back in the AIP in case you did not
>> >> >> notice (and the code needed for it is even implemented and merged
>> >> >> already in main by Vincent).
>> >> >>
>> >> >>
>> >> >>> Provably not true right now, and until ~3.2 delivers the full Task
>> >> >>> SDK/Core dependency separation this would be _more_ work to
>> >> >>> upgrade, not less, and that work is not shared but still on a
>> >> >>> central team.
>> >> >>>
>> >> >>
>> >> >> Absolutely - we will wait for AIP-72 completion. I do not want to
>> >> >> say 3.1 or 3.2 directly - because there are - as you said - a lot
>> >> >> of moving pieces. So my target for multi-team is "after AIP-72 is
>> >> >> completed". Full stop. But there is nothing wrong with accepting
>> >> >> the AIP now and doing preparatory work in parallel. Just as there
>> >> >> is no way for 9 women to have a baby in 1 month, there is no way
>> >> >> that adding more effort to task-sdk isolation will speed it up - we
>> >> >> already have not only 3 people (you leading it, Kaxil and Amog) but
>> >> >> also all the help from me and even 10s of different contributors
>> >> >> (for example with the recent db_test cleanup that I took leadership
>> >> >> on) - and there are people who wish to work on adding multi-team
>> >> >> features. Since the design heavily limits the impact on the airflow
>> >> >> codebase and the interactions with the task-sdk implementation,
>> >> >> there is nothing wrong with starting implementation in parallel
>> >> >> either - the Amazon team is keen to move it forward - they even
>> >> >> already implemented an SQS trigger for assets, and we are working
>> >> >> together on FAB removal and the Keycloak authentication manager -
>> >> >> and they seem to still have capacity and drive to progress
>> >> >> multi-team. So I am not sure what we would be trading off. There is
>> >> >> no "if we work more on task sdk and drop multi-team, things will be
>> >> >> faster". Generally, in open source, people work in the area where
>> >> >> they feel they can provide the best value - such as you working on
>> >> >> task-sdk and me on CI and the dev env - and they will deliver more
>> >> >> value on multi-team.
>> >> >>
>> >> >>
>> >> >>>
>> >> >>> So please, as succinctly as possible, tell me what the direct
>> >> >>> benefit to users of this proposal is over us putting this effort
>> >> >>> into writing better Asset triggers instead?
>> >> >>> better Asset triggers instead?
>> >> >>>
>> >> >>
>> >> >>
>> >> >> * less operational overhead for managing multi-team (once AIP-72
>> >> >> is complete) where separate execution environments are important
>> >> >> * virtual assets sharing
>> >> >> * ability of having "admin" and "team sharing" capability where
>> >> >> dags from multiple teams can be seen in a single Airflow UI
>> >> >> (requires custom RBAC)
>> >> >>
>> >> >> None of this can be done via better asset triggers
>> >> >>
>> >> >>
>> >> >>>
>> >> >>> > On 23 Jun 2025, at 10:57, Jarek Potiuk <ja...@potiuk.com> wrote:
>> >> >>> >
>> >> >>> > My counter-points:
>> >> >>> >
>> >> >>> >
>> >> >>> >> 1. Managing a multi team deployment is not materially different
>> >> >>> >> from managing a deployment per team
>> >> >>> >>
>> >> >>> >
>> >> >>> > It's a bit easier - especially when it comes to upgrades
>> >> >>> > (especially in the case we are targeting, where we are not
>> >> >>> > targeting multi-tenant, but several relatively closely
>> >> >>> > cooperating teams with different dependency requirements and
>> >> >>> > isolation needs).
>> >> >>> >
>> >> >>> > 2. The database changes were quite wide-reaching
>> >> >>> >>
>> >> >>> >
>> >> >>> > Yes, that is addressed.
>> >> >>> >
>> >> >>> >
>> >> >>> >> 3. I don't believe the original AIP (again, I haven't read the
>> >> >>> >> updated proposal or recent messages on the thread yet) will
>> >> >>> >> meet what many users want out of a multi-team solution
>> >> >>> >>
>> >> >>> >
>> >> >>> > I think we will only see when we try. A lot of people think they
>> >> >>> > would, even if they are warned. I know at least one user
>> >> >>> > (Wealthsimple) who definitely wants to use it - they got a very
>> >> >>> > detailed explanation of the idea and understand it well. So I am
>> >> >>> > sure that **some** users would. But we do not know how many.
>> >> >>> >
>> >> >>> >
>> >> >>> >> To expand on those points a bit more
>> >> >>> >>
>> >> >>> >> On 1. The only components that are shared are, I think, the
>> >> >>> >> scheduler and the API server, and it's arguable if that is
>> >> >>> >> actually a good idea given those are likely to be the most
>> >> >>> >> performance sensitive components anyway.
>> >> >>> >>
>> >> >>> >> Additionally, the fact that the scheduler is a shared component
>> >> >>> >> makes upgrading it almost a non-starter, as you would likely
>> >> >>> >> need buy-in, changes, and testing from ALL teams using it. I'd
>> >> >>> >> argue that this is a huge negative until we finish off the
>> >> >>> >> version independence work of AIP-72.
>> >> >>> >>
>> >> >>> >
>> >> >>> > I quite disagree here - especially since our target is that
>> >> >>> > task-sdk provides all the isolation that is needed. There should
>> >> >>> > be 0 changes in the dags needed to upgrade the scheduler,
>> >> >>> > api_server or triggerer - precisely because we introduced the
>> >> >>> > backwards-compatible task-sdk.
>> >> >>> >
>> >> >>> >> On 3, my complaint is essentially that this doesn't go nearly
>> >> >>> >> far enough. It doesn't allow read-only views of other teams'
>> >> >>> >> dags. I don't think it allows you to be in multiple teams at
>> >> >>> >> once. You can't share a connection between teams while only
>> >> >>> >> allowing certain specified dags to access it - it would have to
>> >> >>> >> either be globally usable, or duplicated-and-kept-in-sync
>> >> >>> >> between teams. In short, I think it falls short of being
>> >> >>> >> useful.
>> >> >>> >>
>> >> >>> >
>> >> >>> > Oh, absolutely, all of that is possible (except sharing single
>> >> >>> > connections between multiple teams - which is a very niche use
>> >> >>> > case, and duplication here is perfectly ok as a first
>> >> >>> > approximation - and if we need more we can add it later).
>> >> >>> >
>> >> >>> > Auth manager RBAC and access is abstracted away, and the
>> >> >>> > KeyCloak Manager implemented by Vincent allows managing
>> >> >>> > completely independent and separate RBAC based on arguments and
>> >> >>> > resources provided by Airflow. There is nothing to prevent the
>> >> >>> > user who configures KeyCloak RBAC from defining it this way:
>> >> >>> >
>> >> >>> > if group a -> allow to read a and write b
>> >> >>> > if group b -> allow to write b but not a
>> >> >>> >
>> >> >>> > and any other combinations. The KeyCloak implementation - pretty
>> >> >>> > advanced already - (and the design of the auth manager)
>> >> >>> > completely abstracts away both authentication and authorization
>> >> >>> > to KeyCloak, and KeyCloak has RBAC management built in. Also,
>> >> >>> > any of the users can write their own - even hard-coded -
>> >> >>> > authentication manager to do the same if they do not want to
>> >> >>> > have configurable KeyCloak. Even SimpleAuthManager could be
>> >> >>> > hard-coded to provide those features.
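>> >> >>> >
>> >> >>> > Purely as an illustration - this is not the real Auth Manager
>> >> >>> > API, and all names below are hypothetical - a hard-coded manager
>> >> >>> > could express the group rules above as a simple lookup:
>> >> >>> >
>> >> >>> > # (group, team) -> actions that group may perform on that team
>> >> >>> > ALLOWED = {
>> >> >>> >     ("a", "a"): {"read"},
>> >> >>> >     ("a", "b"): {"write"},
>> >> >>> >     ("b", "b"): {"write"},
>> >> >>> > }
>> >> >>> >
>> >> >>> > def is_authorized(group: str, team: str, action: str) -> bool:
>> >> >>> >     return action in ALLOWED.get((group, team), set())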
>> >> >>> >
>> >> >>> >
>> >> >>> >>
>> >> >>> >> So on the surface, I'm no more in favour of using dag bundle as
>> >> >>> >> a replacement for team id, as I think most of the above points
>> >> >>> >> still stand.
>> still
>> >> >>> stand.
>> >> >>> >>
>> >> >>> >
>> >> >>> > We disagree here.
>> >> >>> >
>> >> >>> >>
>> >> >>> >> My counter proposal: We do _nothing_ to core airflow. We work
>> >> >>> >> on improving the event-based triggering of dags (write more
>> >> >>> >> triggers to read/check remote Assets etc) so that teams can
>> >> >>> >> have 100% isolated deployments but still trigger dags based on
>> >> >>> >> asset events from other teams.
>> >> >>> >> trigger dags based on asset events from other teams.
>> >> >>> >>
>> >> >>> >
>> >> >>> > That does not solve any of the other design goals - it only
>> >> >>> > allows triggering assets a bit more easily (and it's not
>> >> >>> > entirely solved by AIP-82 either, because it does not solve
>> >> >>> > virtual assets - only ones that have a defined trigger and
>> >> >>> > "something" to listen on - which is way more complex than just
>> >> >>> > defining an asset in a Dag and using it in another). I think we
>> >> >>> > can't compare AIP-82 to sharing virtual assets due to the
>> >> >>> > complexity of it. I explained it in the doc.
>> >> >>> > explained it in the doc.
>> >> >>> >
>> >> >>> >
>> >> >>> >> I will now go and catch up with the long thread and updated
>> >> >>> >> proposal and come back.
>> >> >>> >>
>> >> >>> >
>> >> >>> > Please. I hope the above explanation will help in better
>> >> >>> > understanding the proposal, because I think you had some
>> >> >>> > assumptions that do not hold any more with the new proposal.
>> >> >>> >
>> >> >>> > J.
>> >> >>> >
>> >> >>> >
>> >> >>> >>
>> >> >>> >>> On 23 Jun 2025, at 05:54, Jarek Potiuk <ja...@potiuk.com>
>> wrote:
>> >> >>> >>>
>> >> >>> >>> Just to clarify the relation - I updated the AIP now to refer
>> >> >>> >>> to AIP-82 and to explain the relation between "cross-team" and
>> >> >>> >>> "cross-airflow" asset triggering - this is what I added:
>> >> >>> >>>
>> >> >>> >>> Note that there is a relation between AIP-82 ("External Driven
>> >> >>> >>> Scheduling") and this part of the functionality. When you have
>> >> >>> >>> multiple instances of Airflow, you can use shared datasets -
>> >> >>> >>> "physical datasets" - that several Airflow instances can use -
>> >> >>> >>> for example, there could be an S3 object that is produced by
>> >> >>> >>> one airflow instance and consumed by another. That requires a
>> >> >>> >>> deferred trigger to monitor for such datasets, and appropriate
>> >> >>> >>> permissions to the external dataset, and you could achieve a
>> >> >>> >>> similar result to cross-team dataset triggering (but
>> >> >>> >>> cross-airflow). However, the feature of sharing datasets
>> >> >>> >>> between teams also works for virtual assets, which do not have
>> >> >>> >>> physically shared "objects" or a trigger monitoring for
>> >> >>> >>> changes in such an asset.
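>> >> >>> >>>
>> >> >>> >>> For illustration (my own sketch, not part of the AIP text, and
>> >> >>> >>> assuming the Airflow 3 task-sdk imports): a virtual asset
>> >> >>> >>> shared between two dags needs no physical object and no
>> >> >>> >>> trigger at all:
>> >> >>> >>>
>> >> >>> >>> from airflow.sdk import DAG, Asset, task
>> >> >>> >>>
>> >> >>> >>> # no file behind it - just a name both dags agree on
>> >> >>> >>> report = Asset("team-a-daily-report")
>> >> >>> >>>
>> >> >>> >>> with DAG(dag_id="producer", schedule="@daily"):
>> >> >>> >>>     @task(outlets=[report])  # emits an asset event on success
>> >> >>> >>>     def produce():
>> >> >>> >>>         ...
>> >> >>> >>>     produce()
>> >> >>> >>>
>> >> >>> >>> with DAG(dag_id="consumer", schedule=[report]):  # runs on event
>> >> >>> >>>     @task
>> >> >>> >>>     def consume():
>> >> >>> >>>         ...
>> >> >>> >>>     consume()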
>> >> >>> >>>
>> >> >>> >>> J.
>> >> >>> >>>
>> >> >>> >>>
>> >> >>> >>> On Mon, Jun 23, 2025 at 6:38 AM Jarek Potiuk <ja...@potiuk.com>
>> >> >>> >>> wrote:
>> >> >>> >>>
>> >> >>> >>>>> From a quick glance, the updated AIP didn't seem to have any
>> >> >>> >>>>> reference to AIP-82, which surprised me, but will take a
>> >> >>> >>>>> more detailed read through.
>> >> >>> >>>>
>> >> >>> >>>> Yep. It did not - because I did not think it was needed, or
>> >> >>> >>>> even very important, after the simplifications. AIP-82 has a
>> >> >>> >>>> different scope, really. It only helps when the Assets are
>> >> >>> >>>> "real" data files which we have physical triggers for. It's
>> >> >>> >>>> slightly related - sharing datasets between teams (including
>> >> >>> >>>> those that do not require physical files and triggers) is
>> >> >>> >>>> still possible in the design we have now, but it's not (and
>> >> >>> >>>> never was) the **only** reason for having multi-team. There
>> >> >>> >>>> always was (and still is) the possibility of having common,
>> >> >>> >>>> distinct environments (i.e. dependencies and providers) per
>> >> >>> >>>> team, the possibility of having connections and variables
>> >> >>> >>>> that are only accessible to one team and not the other, and
>> >> >>> >>>> isolating workload execution (all that while allowing to
>> >> >>> >>>> manage multiple teams and schedule things with a single
>> >> >>> >>>> deployment). That did not change. What changed a lot is that
>> >> >>> >>>> it is now way simpler - something that we can implement
>> >> >>> >>>> without heavy changes to the codebase - and give it to our
>> >> >>> >>>> users, so that they can assess if this is something they need
>> >> >>> >>>> without too much risk and effort.
>> >> >>> >>>>
>> >> >>> >>>> This was, I believe, the main concern: that the value we get
>> >> >>> >>>> from it is not dramatic, but the required changes are huge.
>> >> >>> >>>> This "redesign" changes the equation - the value is still
>> >> >>> >>>> unchanged, but the cost of implementing it and the impact on
>> >> >>> >>>> the Airflow codebase are much smaller. I still have not heard
>> >> >>> >>>> back from Ash on whether my proposal responds to his original
>> >> >>> >>>> concern though, so I am mostly guessing (also based on the
>> >> >>> >>>> positive reactions of others) that yes, it does. But to be
>> >> >>> >>>> honest I am not sure and I would love to hear back. I decided
>> >> >>> >>>> to update the AIP to reflect it regardless, because I think
>> >> >>> >>>> the simplification I proposed keeps the original goals, but
>> >> >>> >>>> is indeed way simpler.
>> >> >>> >>>>
>> >> >>> >>>>> This is a very difficult thread to catch up on.
>> >> >>> >>>>
>> >> >>> >>>> Valid point. Let me summarize the result:
>> >> >>> >>>>
>> >> >>> >>>> * I significantly simplified the implementation proposal
>> >> >>> >>>> compared to the original version
>> >> >>> >>>> * the main simplification is a very limited impact on the
>> >> >>> >>>> existing database - without the "ripple effect" that would
>> >> >>> >>>> require us to change a lot of tables, including their primary
>> >> >>> >>>> keys, and heavily impact the UI
>> >> >>> >>>> * this is now more of an incremental change that can be
>> >> >>> >>>> implemented way faster and with far less risk
>> >> >>> >>>> * the updated idea is based on leveraging bundles (already
>> >> >>> >>>> part of our data model) and mapping them (many-to-one) to a
>> >> >>> >>>> team - which requires just extending the data model with the
>> >> >>> >>>> bundle mapping and adding team_id to connections and
>> >> >>> >>>> variables. Those are all the needed DB changes.
>> >> >>> >>>>
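>> >> >>> >>>> For illustration - a sketch of what that mapping could look
>> >> >>> >>>> like (table and column names here are my assumptions, not the
>> >> >>> >>>> AIP's final schema; the point is that dag keeps its existing
>> >> >>> >>>> primary key and bundle_name column, and team resolution
>> >> >>> >>>> becomes a lookup rather than a schema rewrite):
>> >> >>> >>>>
>> >> >>> >>>> from sqlalchemy import Column, String
>> >> >>> >>>> from sqlalchemy.orm import declarative_base
>> >> >>> >>>>
>> >> >>> >>>> Base = declarative_base()
>> >> >>> >>>>
>> >> >>> >>>> class DagBundleTeam(Base):
>> >> >>> >>>>     # many bundles can map to one team (many-to-one)
>> >> >>> >>>>     __tablename__ = "dag_bundle_team"  # hypothetical name
>> >> >>> >>>>     bundle_name = Column(String(250), primary_key=True)
>> >> >>> >>>>     team_id = Column(String(50), nullable=False)
>> >> >>> >>>>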
>> >> >>> >>>> The AIP is updated - in one single big change, so it should
>> >> >>> >>>> be easy to compare the changes:
>> >> >>> >>>> https://cwiki.apache.org/confluence/pages/viewpreviousversions.action?pageId=294816378
>> >> >>> >>>> -> I even named the version appropriately "Simplified
>> >> >>> >>>> multi-team AIP" - you can select and compare v.65 with v.66
>> >> >>> >>>> to see the exact differences I proposed.
>> >> >>> >>>>
>> >> >>> >>>> I hope it will be helpful for catching up, and for those who
>> >> >>> >>>> did not follow, to be able to make up their minds about it.
>> >> >>> >>>>
>> >> >>> >>>> J.
>> >> >>> >>>>
>> >> >>> >>>>
>> >> >>> >>>>
>> >> >>> >>>> On Mon, Jun 23, 2025 at 4:35 AM Vikram Koka
>> >> >>> >> <vik...@astronomer.io.invalid>
>> >> >>> >>>> wrote:
>> >> >>> >>>>
>> >> >>> >>>>> This is a very difficult thread to catch up on.
>> >> >>> >>>>> I will take a detailed look at the AIP update to try to
>> >> >>> >>>>> figure out the changes in the proposal.
>> >> >>> >>>>>
>> >> >>> >>>>> From a quick glance, the updated AIP didn't seem to have any
>> >> >>> >>>>> reference to AIP-82, which surprised me, but will take a
>> >> >>> >>>>> more detailed read through.
>> >> >>> through.
>> >> >>> >>>>>
>> >> >>> >>>>> Vikram
>> >> >>> >>>>>
>> >> >>> >>>>>
>> >> >>> >>>>>
>> >> >>> >>>>> On Sun, Jun 22, 2025 at 1:44 AM Pavankumar Gopidesu <
>> >> >>> >>>>> gopidesupa...@gmail.com>
>> >> >>> >>>>> wrote:
>> >> >>> >>>>>
>> >> >>> >>>>>> Thanks Jarek, that's a great update on this AIP - now it's
>> >> >>> >>>>>> much more slimmed down.
>> >> >>> >>>>>>
>> >> >>> >>>>>> I left a minor comment. :) Overall looking great.
>> >> >>> >>>>>> left a minor comment. :) Overall looking great.
>> >> >>> >>>>>>
>> >> >>> >>>>>> Pavan
>> >> >>> >>>>>>
>> >> >>> >>>>>> On Sat, Jun 21, 2025 at 3:10 PM Jens Scheffler
>> >> >>> >>>>> <j_scheff...@gmx.de.invalid
>> >> >>> >>>>>>>
>> >> >>> >>>>>> wrote:
>> >> >>> >>>>>>
>> >> >>> >>>>>>> Thanks for the rework/update of AIP-67!
>> >> >>> >>>>>>>
>> >> >>> >>>>>>> Just a few small comments, but overall I like it as it is
>> >> >>> >>>>>>> much leaner than originally planned and is at a level of
>> >> >>> >>>>>>> complexity where it really seems to be a benefit to close
>> >> >>> >>>>>>> the gap as described.
>> >> >>> >>>>>>>
>> >> >>> >>>>>>> On 21.06.25 14:52, Jarek Potiuk wrote:
>> >> >>> >>>>>>>> I updated the AIP - including the architecture images -
>> >> >>> >>>>>>>> and reviewed it (again) and corrected any ambiguities and
>> >> >>> >>>>>>>> places where it needed to be changed.
>> >> >>> >>>>>>>>
>> >> >>> >>>>>>>> I think the current state
>> >> >>> >>>>>>>> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-67+Multi-team+deployment+of+Airflow+components
>> >> >>> >>>>>>>> nicely describes the proposal.
>> >> >>> >>>>>>>>
>> >> >>> >>>>>>>> Compared to the previous one:
>> >> >>> >>>>>>>>
>> >> >>> >>>>>>>> 1. The DB changes are far less intrusive - no ripple
>> >> >>> >>>>>>>> effect on Airflow
>> >> >>> >>>>>>>> 2. There is no need to merge configurations and provide a
>> >> >>> >>>>>>>> different set of configs per team - we can add it later,
>> >> >>> >>>>>>>> but I do not see why we need it in this simplified version
>> >> >>> >>>>>>>> 3. We can still configure a different set of executors
>> >> >>> >>>>>>>> per team - that is already implemented (we just need to
>> >> >>> >>>>>>>> wire it to the bundle -> team mapping).
>> >> >>> >>>>>>> mapping).
>> >> >>> >>>>>>>>
>> >> >>> >>>>>>>> I think it will be way simpler and faster to implement
>> >> >>> >>>>>>>> this way, and it should serve as an MVMT - a Minimum
>> >> >>> >>>>>>>> Viable Multi Team - that we can give our users so that
>> >> >>> >>>>>>>> they can provide feedback.
>> >> >>> >>>>>>>>
>> >> >>> >>>>>>>> J.
>> >> >>> >>>>>>>>
>> >> >>> >>>>>>>>
>> >> >>> >>>>>>>>
>> >> >>> >>>>>>>>
>> >> >>> >>>>>>>> On Fri, Jun 20, 2025 at 8:33 AM Jarek Potiuk <
>> >> ja...@potiuk.com>
>> >> >>> >>>>> wrote:
>> >> >>> >>>>>>>>
>> >> >>> >>>>>>>>>
>> >> >>> >>>>>>>>>
>> >> >>> >>>>>>>>>> I like this iteration a bit more now for sure, thanks
>> >> >>> >>>>>>>>>> for being receptive to feedback! :)
>> >> >>> >>>>>>>>>> to feedback! :)
>> >> >>> >>>>>>>>>>
>> >> >>> >>>>>>>>>
>> >> >>> >>>>>>>>>> This now becomes quite close to what was proposed
>> >> >>> >>>>>>>>>> before: we now again have a team ID (which I think is
>> >> >>> >>>>>>>>>> really needed here, glad to see it back) and it will be
>> >> >>> >>>>>>>>>> used for auth management, configuration specification,
>> >> >>> >>>>>>>>>> etc., but it will be carried by the Bundle instead of
>> >> >>> >>>>>>>>>> the dag model. Which, as you say, means "For that we
>> >> >>> >>>>>>>>>> will need to make sure that both api-server, scheduler
>> >> >>> >>>>>>>>>> and triggerer have access to the "bundle definition"
>> >> >>> >>>>>>>>>> (to perform the mapping)" - which honestly doesn't feel
>> >> >>> >>>>>>>>>> too different from the original proposal we had last
>> >> >>> >>>>>>>>>> week of adding it to the Dag table and ensuring it's
>> >> >>> >>>>>>>>>> available everywhere. But either way I'm happy to meet
>> >> >>> >>>>>>>>>> in the middle and keep it on Bundle if everyone else
>> >> >>> >>>>>>>>>> feels that's a more suitable location.
>> >> >>> >>>>>>>>>>
>> >> >>> >>>>>>>>> I think the big difference is the "ripple effect" that
>> >> >>> >>>>>>>>> was discussed in
>> >> >>> >>>>>>>>> https://lists.apache.org/thread/78vndnybgpp705j6sm77l1t6xbrtnt5c
>> >> >>> >>>>>>>>> (and, I believe - correct me if I am wrong, Ash - an
>> >> >>> >>>>>>>>> important trigger for the discussion). So far, what we
>> >> >>> >>>>>>>>> wanted was to extend the primary key, and it would ripple
>> >> >>> >>>>>>>>> through all the pieces of Airflow -> models, API, UI
>> >> >>> >>>>>>>>> etc. ...
>> >> >>> >>>>>>>>> However - we already have `bundle_name` and
>> >> >>> >>>>>>>>> `bundle_version` in the Dag model. So I think when we add
>> >> >>> >>>>>>>>> a separate table where we map the bundle to the team, the
>> >> >>> >>>>>>>>> "ripple effect" will be almost 0. We do not want to
>> >> >>> >>>>>>>>> change the primary key, and we do not want to change the
>> >> >>> >>>>>>>>> UI in any way (except filtering the DAGs available based
>> >> >>> >>>>>>>>> on your team - but that will be handled in the Auth
>> >> >>> >>>>>>>>> Manager and will not impact the UI in any way). I think
>> >> >>> >>>>>>>>> that's a huge simplification of the implementation, and
>> >> >>> >>>>>>>>> if we agree to it - I think it should speed up the
>> >> >>> >>>>>>>>> implementation significantly. There are only a limited
>> >> >>> >>>>>>>>> number of places where you need to look up the team_id -
>> >> >>> >>>>>>>>> so having the bundle -> team mapping in a separate table
>> >> >>> >>>>>>>>> and having to look it up should not be a problem. And it
>> >> >>> >>>>>>>>> has much less complexity and "ripple-effect" through the
>> >> >>> >>>>>>>>> codebase (for example, I could imagine 100s or thousands
>> >> >>> >>>>>>>>> of already-written tests that would have to be adapted if
>> >> >>> >>>>>>>>> we changed the primary key - whereas there will be pretty
>> >> >>> >>>>>>>>> much zero impact on existing tests if we just add a
>> >> >>> >>>>>>>>> bundle -> team lookup table).
>> >> >>> >>>>>>>>>
>> >> >>> >>>>>>>>>
>> >> >>> >>>>>>>>>> One other thing I'd point out is that I think including
>> >> >>> >>>>>>>>>> executors per team is a very easy win and quite
>> >> >>> >>>>>>>>>> possible without much work. I already have much of the
>> >> >>> >>>>>>>>>> code written. Executors are already aware of the Teams
>> >> >>> >>>>>>>>>> that own them (merged), and I have a PR open to have
>> >> >>> >>>>>>>>>> configuration per team (with a quite simple and
>> >> >>> >>>>>>>>>> isolated approach, which I believe you approved,
>> >> >>> >>>>>>>>>> Jarek). The last piece is updating the scheduling logic
>> >> >>> >>>>>>>>>> to route tasks from a particular Bundle to the correct
>> >> >>> >>>>>>>>>> executor, which shouldn't be much work (though it would
>> >> >>> >>>>>>>>>> be easier if the Task models had a column for the team
>> >> >>> >>>>>>>>>> they belong to, rather than having to look up the Dag
>> >> >>> >>>>>>>>>> and Bundle to get the team). I have a branch where I
>> >> >>> >>>>>>>>>> was experimenting with this logic already.
>> >> >>> >>>>>>>>>> Anywho, long story short, I don't think we necessarily
>> >> >>> >>>>>>>>>> need to remove this piece from the project's scope if
>> >> >>> >>>>>>>>>> it is already partly done and not too difficult.
>> >> >>> >>>>>>>>>>
>> >> >>> >>>>>>>>> Yeah, I hear you here again. Certainly I would not want
>> >> >>> >>>>>>>>> to just **remove** it from the code. And, yep, I totally
>> >> >>> >>>>>>>>> forgot we have it in. And if we can make it in easily
>> >> >>> >>>>>>>>> (which it seems we can) - we can also include it in the
>> >> >>> >>>>>>>>> first iteration. What I really wanted to avoid (from the
>> >> >>> >>>>>>>>> original design) - again trying to simplify it, limit
>> >> >>> >>>>>>>>> the changes, and speed up implementation - is one
>> >> >>> >>>>>>>>> "complexity" specifically: having to have a separate,
>> >> >>> >>>>>>>>> additional configuration per team. Not only because it
>> >> >>> >>>>>>>>> complicates the already complex configuration handling
>> >> >>> >>>>>>>>> (I know we have a PR for that), but mostly because, if
>> >> >>> >>>>>>>>> it is not needed, we can simplify the documentation and
>> >> >>> >>>>>>>>> explain to our users more easily what they need to do to
>> >> >>> >>>>>>>>> have their own multi-team setup. And I am quite open to
>> >> >>> >>>>>>>>> keeping multiple executors if we can avoid complicating
>> >> >>> >>>>>>>>> the configuration.
>> >> >>> >>>>>>>>>
>> >> >>> >>>>>>>>> But I think some details of that, and whether we really
>> >> >>> >>>>>>>>> need separate configuration, might also come as a result
>> >> >>> >>>>>>>>> of updating the AIP - I am not quite sure now if we need
>> >> >>> >>>>>>>>> it, but we can discuss it when we iterate on the AIP.
>> >> >>> >>>>>>>>>
>> >> >>> >>>>>>>>> J.
>> >> >>> >>>>>>>>>
>> >> >>> >>>>>>>>>
>> >> >>> >>>>>>>
>> >> >>> >>>>>>>
>> >> >>>
>> >> >>> >>>>>>>
>> >> >>> >>>>>>>
>> >> >>> >>>>>>
>> >> >>> >>>>>
>> >> >>> >>>>
>> >> >>> >>
>> >> >>> >>
>> >> >>> >>
>> >> >>>
>> >> >>>
>> >>
>> >
>>
>
