Any last comments? There is a long weekend coming up in the US, so I will likely start voting on the updated AIP on Monday the 7th.
On Fri, Jun 27, 2025 at 12:41 PM Jarek Potiuk <ja...@potiuk.com> wrote:
> I'd really love to finalise the discussion and put it up to a vote some time after the recording from the last dev call is posted - so that more context, details and the LONG discussion we had on it are available. There is no *huge* hurry - we have a strong dependency on Task Isolation and it seems that it will still take a bit of time to complete, so I'd say I would love to start voting in about a week's time - so that maybe at the next dev call we can "seal" the subject. Happy to see any more comments - especially from those who have opinions but have not had the opportunity to express them.
>
> I am personally very happy with the direction it took - simplification and an "MVP" kind of approach - also I invite our stakeholders to take a close look at the scope and what we really propose - I have a feeling that we can balance it out - there is something we can do to make it not "worse" for the offerings they have. I think we have a really good symbiotic relationship here, and I would love to leverage that. For one - my goal here is to have a minimum number of changes that impact the maintainability of open-source airflow - but mostly "opening up some possibilities" - rather than providing turn-key solutions. And mostly because this is good for all sides - less maintenance and complexity for OSS maintainers, but more opportunities to make it into "turn-key" solutions by the stakeholders, while also allowing the "on-prem" users - if they are highly motivated - to use those features by adding the "turn-key" layer on their own. Also, adding multi-team should not be at the expense of "simple" installations - they should be virtually unaffected.
>
> One example of applying this is cutting out "separate config files". I think it moves us closer to a "turn-key" solution but it is not really necessary to achieve the three goals above - that's why in the current proposal this part is completely removed - sorry Niko, but I still think it's one of the things that falls into this bucket. We can easily remove it - separate config files complicate the code, documentation and options the users have, and even if it is a "little" more complex to manage configuration by motivated users, it's also an opportunity for a "turn-key" option that stakeholders can build into their products - and we do not have to maintain it in the open-source. So I would be rather strong on **not** touching the current configuration and simply adding configuration for per-team executors in the executor config - even if it is uglier and more "low-level".
>
> So if there are some constructive ideas on what can be done to make it "simpler" and less "turn-key" in that respect - I would highly value such ideas and comments. If we can cut down something more that is not "necessary" for the three primary goals I came up with - I am more than happy to do it.
>
> Just to remind - those are the "extracted" goals. I slightly updated them and added them to the preamble of the AIP:
>
> * less operational overhead for managing multi-team (once AIP-72 is complete) where separate execution environments are important
> * virtual assets sharing between teams
> * ability of having "admin" and "team sharing" capability where dags from multiple teams can be seen in a single Airflow UI (requires custom RBAC and an AIP-56 implementation of the Auth Manager - with the KeyCloak Auth Manager being a reference implementation)
>
> J.
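For anyone catching up, the "virtual assets sharing between teams" goal above boils down to something like this at the dag level - a purely illustrative sketch (the dag and asset names are made up); the point is that both dags live in one deployment, just in different bundles/teams, so no physical object and no AIP-82 watcher is needed for the hand-off:

    # Illustrative only - this is the plain Airflow 3 Asset API; nothing multi-team
    # specific appears in the code itself. The multi-team part is that each dag would
    # sit in a different team's bundle of the same deployment.
    from airflow.sdk import Asset, dag, task

    reports_ready = Asset("team_a_reports_ready")  # "virtual" asset - no backing file

    @dag(schedule=None)  # would live in team A's bundle
    def team_a_producer():
        @task(outlets=[reports_ready])
        def publish():
            ...  # emitting the asset event on success is enough - nothing to poll

        publish()

    @dag(schedule=[reports_ready])  # would live in team B's bundle
    def team_b_consumer():
        @task
        def consume():
            ...

        consume()

    team_a_producer()
    team_b_consumer()

Doing the same across two fully separate deployments (the AIP-82 route discussed further down) would instead need a physical object plus a watcher/trigger and credentials on the consuming side.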
> On Thu, Jun 26, 2025 at 10:53 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>>
>>> One technical observation: Now that the dag table no longer has a team_id in it, what would the behaviour be when a DAG is attempted to move between bundles? How do we detect this? (I'm not at all convinced that we correctly detect duplicate dag ids across bundles today, so I wouldn't assume or rely on the current behaviour.)
>>
>> Of course - yes, I realise that - that problem was also not handled in the previous iteration, to be honest. That is something that the dag bundle solution allows us to solve eventually - but I do not think it's a blocker for the proposed implementation. We will have to eventually add some way of blocking dags from jumping between bundles; we might tackle this separately. I already wanted to propose a separate update to that - but I did not want to complicate the current proposal. One thing at a time. I can, however - if you consider that a blocker - extend the current AIP with it. Not a big problem. This is however a bit independent of the team_id introduction.
>>
>>> Overall, I am still unconvinced this proposal has enough real user benefit over actually separate deployments, and on balance of the added complexity and maintenance burden I do not think it is worth it.
>>
>> That makes me sad. I thought that over the course of the discussion I addressed all the concerns (in this case the concern was "is it worth it, given the cost and little benefit"), but when I did and heavily limited the impact, now the concern is "is it worth it at all, as the changes are really minimal" - and surely, anyone can change and adapt their concerns over time, but that one seems like an ever-moving target. I hoped at least for some acknowledgment that some concerns (complexity in this case) were addressed, but it seems that you are deeply convinced that we do not need multi-team at all (which is in stark contrast with at least a dozen bigger and smaller users of Airflow who submitted talks to Airflow Summit - including about 5 or 6 submissions for Airflow Summit 2025 - on how they spent their engineering effort, time and money on trying to achieve something similar). They assessed that it's worth it, you assess that it's not. Somehow I trust our users that they were not spending the money, time and engineering effort to achieve this because they wanted to spend more money. I think they assessed it's worth it. So I want to make it a bit easier and give them a more "proper" way to do that.
>>
>>> Upgrades: it is not easier to upgrade under this multi team proposal, but much much harder. This is based on hard earned experience from helping Astronomer users — having to coordinate upgrades between multiple teams turns into a months-long slog of the hardest kind of work — people work: getting other teams to agree to do things that they don't directly care about — "It's working for me, I don't care about upgrading, we'll get to it next quarter" is a refrain I've heard many times.
>>
>> Yes, absolutely - this is why we deferred it until we knew what shape task isolation and the other AIPs we depend on would take. Because it is clear that pretty much all the problems you explain above are going to be solved with task isolation. And it's not just my opinion. If you want to argue with it, you likely need to argue with yourself: https://github.com/apache/airflow/issues/51545#issuecomment-2980038478.
>> Let me quote what you wrote there last week:
>>
>> Ash Berlin Taylor wrote:
>>
>> > A tight coupling between task-sdk and any "server side" component is the opposite to one of the goals of AIP-72 (I'm not sure we ever explicitly said this, but the first point of motivation for the AIP says "Dependency conflicts for administrators supporting data teams using different versions of providers, libraries, or python packages")
>> > In short, my goal with TaskSDK, and the reason for introducing CalVer and Cadwyn with the execution API, is to end up in a world where you can upgrade the Airflow Scheduler/API server independently of any worker nodes (with the exception that the server must be at least as new as the clients)
>> > This ability to have version-skew is pretty much non-negotiable to me and is (other than other languages) one of the primary benefits of AIP-72
>>
>> If you read your own words in that quote, it basically means "it will be easy to upgrade airflow independently of workers". So I am a bit confused here. Yes, I agree it was difficult, but you yourself explained that once AIP-72 (which, since AIP-67 was accepted, has always been a prerequisite of it) is complete, it will be "easy". So I am not sure why you are bringing it up now. We assume AIP-72 will be completed and this problem will be gone. Let's not mention it any more please.
>>
>>> The true separation from TaskSDK will likely only land in about the 3.2 time frame. We are actively working on it, but it's a slow process of untangling lots of assumptions made in the code base over the years. Maybe once we have that my view would be different, but right now I think this makes the proposal a non-starter. Especially as you are saying that most teams will have unique connections. If they've got those already, then having an asset trigger use those conns to watch/poll for activity is a much easier solution to operate and crucially, to scale and upgrade.
>>
>> Yes. I perfectly understand that and I am fully aware of the potential 3.2 time-frame. And that's fine. Actually I heartily invite you to listen to the part of my talk from Berlin Buzzwords where I was asked about the timeline - https://youtu.be/EyhZOnbwc-4?t=2226 - this link leads to the exact timeline in my talk. My answer was basically "3.1" or "3.2", and I sincerely hope "3.1", but we might not be able to complete it because we have other things to do (the "other" is indeed the Task Isolation work that you are leading). And that's perfectly fine. And it absolutely does not prevent us from voting on the AIP now - similarly to how we voted on the previous version of the AIP a few months ago, knowing that it had some prerequisites. Especially since we know that the feature we need from task isolation is "non-negotiable". I.e. it WILL happen. We don't hope for it, we know it will be there. Those are your own words.
>>
>>> > I think we can't compare AIP-82 to sharing virtual assets due to the complexity of it.
>>>
>>> Virtual Assets was a mistake, and not how users actually want to use them. Mea culpa
>>
>> This is the first time I hear this - certainly you never raised this concern on the devlist. So if you have some concerns about virtual assets I think you should raise them on the devlist, because I think everyone here is missing some conversation (or maybe it's just your private opinion that you never shared with anyone, but maybe it's worth sharing).
>> I would be interested to hear how the absolutely most successful feature of Airflow 2 was a mistake. According to the 2024 survey https://airflow.apache.org/blog/airflow-survey-2024/ - 48% of Airflow users have been using it, even though it was added as one of the last big features of Airflow 2. It's the MOST used feature out of all the features out there. I would be really curious to see how it was a mistake (but please start a separate thread explaining why you think it was a mistake, what your data points are and what you think should be fixed). Just dropping "virtual assets were a mistake" in the middle of a multi-team conversation seems completely unjustified without knowing what you are talking about. So I think, until we know more, this argument has no basis.
>>
>>> To restate my points:
>>>
>>> - Sharing a deployment between teams today/in 3.1 is operationally more complex (both scaling, and upgrades) — this is a con, not a plus.
>>
>> Surely. But it will be easier when AIP-72 is complete (which I am definitely looking forward to and which, as clearly explained in AIP-67, is a prerequisite of it). Nothing changed here.
>>
>>> - The main user benefit appears to be "allow teams' DAGs to communicate via Assets", in which case we can do that today by putting more work into AIP-82's Asset triggers
>>
>> No. Lower operational complexity for multi-teams (provided that we deliver AIP-72) is another benefit. Virtual assets are another, and since there are no grounds for the "virtual assets are a mistake" statement (not until you explain what you mean by that in a separate discussion) - this is also still a very valid point.
>>
>>> Soon, we will then be asked about cross-team governance, policy enforcement, and potentially unbounded edge cases (e.g., team-specific secrets, roles, quotas). Again, you get this for free with truly separate deployments already
>>> allow different teams to use different executors (including multiple executors per-team following AIP-61)
>>
>> Not really. We very explicitly say in the AIP that this is not a goal and that we have no plans for it. And yes, using separate executors per team is actually back in AIP-67 in case you did not notice (and the code needed for it is even implemented and merged already in main by Vincent).
>>
>>> Provably not true right now, and until ~3.2 delivers the full Task SDK/Core dependency separation this would be _more_ work to upgrade, not less, and that work is not shared but still on a central team.
>>
>> Absolutely - we will wait for AIP-72 completion. I do not want to say 3.1 or 3.2 directly - because there are - as you said - a lot of moving pieces. So my target for multi-team is "After AIP-72 is completed". Full stop. But there is nothing wrong with accepting the AIP now and doing preparatory work in parallel. Just as there is no way to have a baby in 1 month with 9 women, there is no way that adding more effort to task-sdk isolation will speed it up - we already have not only 3 people (you leading it, Kaxil and Amog) but also all the help from me and even 10s of different contributors (for example with the recent db_test cleanup that I took leadership on) - and there are people who wish to work on adding multi-team features.
>> Since the design heavily limits the impact on the airflow codebase and interactions with the task-sdk implementation, there is nothing wrong with starting implementation in parallel either - the Amazon team is keen to move it forward - they even already implemented an SQS trigger for assets, and we are working together on FAB removal and the Keycloak authentication manager - and they seem to still have capacity and drive to progress multi-team. So I am not sure that we are trading off anything. There is no "if we work more on task-sdk and drop multi-team, things will be faster". Generally in open source people work in the area where they feel they can provide the best value - such as you working on task-sdk, me on CI and dev env - and they will deliver more value on multi-team.
>>
>>> So please, as succinctly as possible, tell me what the direct benefit to users of this proposal is over us putting this effort into writing better Asset triggers instead?
>>
>> * less operational overhead for managing multi-team (once AIP-72 is complete) where separate execution environments are important
>> * virtual assets sharing
>> * ability of having "admin" and "team sharing" capability where dags from multiple teams can be seen in a single Airflow UI (requires custom RBAC)
>>
>> None of this can be done via better asset triggers.
>>
>>> On 23 Jun 2025, at 10:57, Jarek Potiuk <ja...@potiuk.com> wrote:
>>> >
>>> > My counter-points:
>>> >
>>> >> 1. Managing a multi team deployment is not materially different from managing a deployment per team
>>> >
>>> > It's a bit easier - especially when it comes to upgrades (especially in the case we are targeting, where we are not targeting multi-tenant, but several relatively closely cooperating teams with different dependency requirements and isolation needs).
>>> >
>>> >> 2. The database changes were quite wide-reaching
>>> >
>>> > Yes, that is addressed.
>>> >
>>> >> 3. I don't believe the original AIP (again, I haven't read the updated proposal or recent messages on the thread yet) will meet what many users want out of a multi-team solution
>>> >
>>> > I think we will only see when we try. A lot of people think it would, even if they are warned. I know at least one user (Wealthsimple) who definitely wants to use it - they got a very detailed explanation of the idea and understand it well. So I am sure that **some** users would. But we do not know how many.
>>> >
>>> >> To expand on those points a bit more
>>> >>
>>> >> On 1. The only components that are shared are, I think, the scheduler and the API server, and it's arguable if that is actually a good idea given those are likely to be the most performance sensitive components anyway.
>>> >>
>>> >> Additionally the fact that the scheduler is a shared component makes upgrading it almost a non-starter as you would likely need buy-in, changes, and testing from ALL teams using it. I'd argue that this is a huge negative until we finish off the version independence work of AIP-72.
>>> >
>>> > I quite disagree here - especially since our target is that task-sdk is supposed to provide all the isolation that is needed. There should be 0 changes in the dags needed to upgrade the scheduler, api_server and triggerer - precisely because we introduced a backwards-compatible task-sdk.
>>> >> On 3 my complaint is essentially that this doesn't go nearly far enough. It doesn't allow read-only views of other teams' dags. I don't think it allows you to be in multiple teams at once. You can't share a connection between teams but only allow certain specified dags to access it; it would have to either be globally usable, or duplicated-and-kept-in-sync between teams. In short I think it falls short of being useful.
>>> >
>>> > Oh absolutely, all that is possible (except sharing single connections between multiple teams - which is a very niche use case, and duplication here is perfectly ok as a first approximation - and if we need more we can add it later).
>>> >
>>> > Auth manager RBAC and access are abstracted away, and the KeyCloak Manager implemented by Vincent allows managing completely independent and separate RBAC based on arguments and resources provided by Airflow. There is nothing to prevent the user who configures KeyCloak RBAC from defining it this way:
>>> >
>>> > if group a > allow to read a and write b
>>> > if group b > allow to write b but not a
>>> >
>>> > and any other combinations. The KeyCloak implementation - pretty advanced already - (and the design of the auth manager) completely abstracts away both authentication and authorization to KeyCloak, and KeyCloak has RBAC management built in. Also, any of the users can write their own - even hard-coded - authentication manager to do the same if they do not want to have configurable KeyCloak. Even SimpleAuthManager could be hard-coded to provide those features.
>>> >
>>> >> So on the surface, I'm no more in favour of using the dag bundle as a replacement for team id as I think most of the above points still stand.
>>> >
>>> > We disagree here.
>>> >
>>> >> My counter proposal: We do _nothing_ to core airflow. We work on improving the event-based triggering of dags (write more triggers to read/check remote Assets etc) so that teams can have 100% isolated deployments but still trigger dags based on asset events from other teams.
>>> >
>>> > That does not solve any of the other design goals - it only allows triggering assets a bit more easily (but also it's not entirely solved by AIP-82, because it does not solve virtual assets - only ones that have a defined triggerer and "something" to listen on - which is way more complex than just defining an asset in a Dag and using it in another). I think we can't compare AIP-82 to sharing virtual assets due to the complexity of it. I explained it in the doc.
>>> >
>>> >> I will now go and catch up with the long thread and updated proposal and come back.
>>> >
>>> > Please. I hope the above explanation will help in better understanding the proposal, because I think you had some assumptions that no longer hold with the new proposal.
>>> >
>>> > J.
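To make the "group a / group b" rules above concrete for those skimming: a minimal, hard-coded sketch of the kind of mapping an auth manager implementation could encode (purely illustrative - the group, team and bundle names are made up, and the function below is a simplification, not the real BaseAuthManager interface, which hides all of this behind its is_authorized_* methods):

    # Illustrative only: group -> per-team permissions, resolved via the bundle -> team
    # association described in the AIP. A real implementation would live inside an auth
    # manager (e.g. the KeyCloak one), not in a free-standing function like this.
    GROUP_RULES = {
        "group_a": {"team_a": {"read"}, "team_b": {"read", "write"}},  # read a, write b
        "group_b": {"team_b": {"read", "write"}},                      # write b, no access to a
    }

    BUNDLE_TO_TEAM = {"bundle_analytics": "team_a", "bundle_ml": "team_b"}  # made-up mapping

    def is_dag_action_allowed(user_groups: set[str], bundle_name: str, action: str) -> bool:
        """Return True if any of the user's groups grants `action` on the dag's team."""
        team = BUNDLE_TO_TEAM.get(bundle_name)
        if team is None:
            return False
        return any(action in GROUP_RULES.get(group, {}).get(team, set()) for group in user_groups)

    # e.g. is_dag_action_allowed({"group_b"}, "bundle_analytics", "read") -> False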
>>> >> On 23 Jun 2025, at 05:54, Jarek Potiuk <ja...@potiuk.com> wrote:
>>> >>>
>>> >>> Just to clarify the relation - I updated the AIP now to refer to AIP-82 and to explain the relation between the "cross-team" and "cross-airflow" asset triggering - this is what I added:
>>> >>>
>>> >>> Note that there is a relation between AIP-82 ("External Event Driven Scheduling") and this part of the functionality. When you have multiple instances of Airflow, you can use shared datasets - "Physical datasets" - that several Airflow Instances can use - for example there could be an S3 object that is produced by one airflow instance, and consumed by another. That requires a deferred trigger to monitor for such datasets, and appropriate permissions to the external dataset, and you could achieve a similar result to cross-team dataset triggering (but cross-airflow). However, the feature of sharing datasets between the teams also works for virtual assets, which do not have physically shared "objects" or a trigger monitoring for changes in such an asset.
>>> >>>
>>> >>> J.
>>> >>>
>>> >>> On Mon, Jun 23, 2025 at 6:38 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>>> >>>
>>> >>>>> From a quick glance, the updated AIP didn't seem to have any reference to AIP-82, which surprised me, but will take a more detailed read through.
>>> >>>>
>>> >>>> Yep. It did not - because I did not think it was needed or even very important after the simplifications. AIP-82 has a different scope, really. It only helps when the Assets are "real" data files which we have physical triggers for; it's slightly related - sharing datasets between teams (including those that do not require physical files and triggers) is still possible in the design we have now, but it's not (and never was) the **only** reason for having multi-team. There always was (and still is) the possibility of having common, distinct environments (i.e. dependencies and providers) per team, the possibility of having connections and variables that are only accessible to one team and not the other, and isolating workload execution (all that while allowing management of multiple teams and scheduling with a single deployment). That did not change. What changed a lot is that it is now way simpler, something that we can implement without heavy changes to the codebase - and give it to our users, so that they can assess if this is something they need without too much risk and effort.
>>> >>>>
>>> >>>> This was - I believe - the main concern: that the value we get from it is not dramatic, but the required changes are huge. This "redesign" changes the equation - the value is still unchanged, but the cost of implementing it and the impact on the Airflow codebase is much smaller. I still have not heard back from Ash if my proposal responds to his original concern though, so I am mostly guessing (also based on the positive feedback from others) that yes, it does.
>>>> But to be honest I am not sure, and I would love to hear back. I decided to update the AIP to reflect it regardless, because I think the simplification I proposed keeps the original goals, but is indeed way simpler.
>>>>
>>>>> This is a very difficult thread to catch up on.
>>>>
>>>> Valid point. Let me summarize what the result is:
>>>>
>>>> * I significantly simplified the implementation proposal compared to the original version
>>>> * the main simplification is the very limited impact on the existing database - without the "ripple effect" that would require us to change a lot of tables, including their primary keys, and heavily impact the UI
>>>> * this is now more of an incremental change that can be implemented way faster and with far less risk
>>>> * the updated idea is based on leveraging bundles (already part of our data model) and mapping them (many-to-one) to a team - which requires just extending the data model with a bundle mapping and adding team_id to connections and variables. Those are all the needed DB changes.
>>>>
>>>> The AIP is updated - in one single big change, so it should be easy to compare the changes:
>>>> https://cwiki.apache.org/confluence/pages/viewpreviousversions.action?pageId=294816378
>>>> -> I even named the version appropriately "Simplified multi-team AIP" - you can select and compare v.65 with v.66 to see the exact differences I proposed.
>>>>
>>>> I hope it will be helpful for catching up and, for those who did not follow, to be able to make up their minds about it.
>>>>
>>>> J.
>>>>
>>>> On Mon, Jun 23, 2025 at 4:35 AM Vikram Koka <vik...@astronomer.io.invalid> wrote:
>>>>
>>>>> This is a very difficult thread to catch up on.
>>>>> I will take a detailed look at the AIP update to try to figure out the changes in the proposal.
>>>>>
>>>>> From a quick glance, the updated AIP didn't seem to have any reference to AIP-82, which surprised me, but will take a more detailed read through.
>>>>>
>>>>> Vikram
>>>>>
>>>>> On Sun, Jun 22, 2025 at 1:44 AM Pavankumar Gopidesu <gopidesupa...@gmail.com> wrote:
>>>>>
>>>>>> Thanks Jarek, that's a great update on this AIP, now it's much more slimmed down.
>>>>>>
>>>>>> Left a minor comment. :) Overall looking great.
>>>>>>
>>>>>> Pavan
>>>>>>
>>>>>> On Sat, Jun 21, 2025 at 3:10 PM Jens Scheffler <j_scheff...@gmx.de.invalid> wrote:
>>>>>>
>>>>>>> Thanks for the rework/update of the AIP!
>>>>>>>
>>>>>>> Just a few small comments, but overall I like it as it is much leaner than originally planned and is at a level of complexity where it really seems to be a benefit to close the gap as described.
>>>>>>>
>>>>>>> On 21.06.25 14:52, Jarek Potiuk wrote:
>>>>>>>> I updated the AIP - including architecture images - and reviewed it (again) and corrected any ambiguities and places where it needed to be changed.
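For those making up their minds, the DB footprint Jarek summarises above ("bundle mapping plus team_id on connections and variables") is small enough to sketch in a few lines - purely illustrative, with made-up table and column names; the real models obviously carry more fields:

    # Illustrative sketch of the proposed data-model delta only: a team table, a
    # many-to-one bundle -> team mapping, and a nullable team_id on connections and
    # variables. The scheduler/api-server would resolve a dag's team by looking up its
    # bundle_name in the mapping table. Names here are assumptions, not the final schema.
    from sqlalchemy import Column, ForeignKey, String
    from sqlalchemy.orm import declarative_base

    Base = declarative_base()

    class Team(Base):
        __tablename__ = "team"
        id = Column(String(50), primary_key=True)

    class BundleTeamMapping(Base):
        __tablename__ = "bundle_team_mapping"
        bundle_name = Column(String(250), primary_key=True)                   # many bundles ...
        team_id = Column(String(50), ForeignKey("team.id"), nullable=False)   # ... map to one team

    class Connection(Base):  # stand-in for the existing model, showing only the new column
        __tablename__ = "connection"
        conn_id = Column(String(250), primary_key=True)
        team_id = Column(String(50), ForeignKey("team.id"), nullable=True)    # NULL = not team-scoped

    class Variable(Base):  # stand-in for the existing model, showing only the new column
        __tablename__ = "variable"
        key = Column(String(250), primary_key=True)
        team_id = Column(String(50), ForeignKey("team.id"), nullable=True)    # NULL = not team-scoped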
>>>>>>>> I think the current state
>>>>>>>> https://cwiki.apache.org/confluence/display/AIRFLOW/AIP-67+Multi-team+deployment+of+Airflow+components
>>>>>>>> nicely describes the proposal.
>>>>>>>>
>>>>>>>> Comparing to the previous one:
>>>>>>>>
>>>>>>>> 1. The DB changes are far less intrusive - no ripple effect on Airflow
>>>>>>>> 2. There is no need to merge configurations and provide a different set of configs per team - we can add it later, but I do not see why we need it in this simplified version
>>>>>>>> 3. We can still configure a different set of executors per team - that is already implemented (we just need to wire it to the bundle -> team mapping).
>>>>>>>>
>>>>>>>> I think it will be way simpler and faster to implement this way and it should serve as an MVMT -> Minimum Viable Multi-Team - that we can give our users so that they can provide feedback.
>>>>>>>>
>>>>>>>> J.
>>>>>>>>
>>>>>>>> On Fri, Jun 20, 2025 at 8:33 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>>>>>>>>
>>>>>>>>>> I like this iteration a bit more now for sure, thanks for being receptive to feedback! :)
>>>>>>>>>>
>>>>>>>>>> This now becomes quite close to what was proposed before; we now again have a team ID (which I think is really needed here, glad to see it back) and it will be used for auth management, configuration specification, etc, but will be carried by Bundle instead of the dag model. Which, as you say, "For that we will need to make sure that both api-server, scheduler and triggerer have access to the 'bundle definition' (to perform the mapping)" - which honestly doesn't feel too much different from the original proposal we had last week of adding it to the Dag table and ensuring it's available everywhere. But either way I'm happy to meet in the middle and keep it on Bundle if everyone else feels that's a more suitable location.
>>>>>>>>>
>>>>>>>>> I think the big difference is the "ripple effect" that was discussed in https://lists.apache.org/thread/78vndnybgpp705j6sm77l1t6xbrtnt5c (and I believe - correct me if I am wrong, Ash - an important trigger for the discussion). So far what we wanted was to extend the primary key, and it would ripple through all the pieces of Airflow -> models, API, UI etc. ... However - we already have `bundle_name` and `bundle_version` in the Dag model. So I think when we add a separate table where we map the bundle to the team, the "ripple effect" will be almost 0.
>>>>>>>>> We do not want to change the primary key, and we do not want to change the UI in any way (except filtering of the DAGs available based on your team - but that will be handled in the Auth Manager and will not impact the UI in any way). I think that's a huge simplification of the implementation, and if we agree to it - I think it should speed up the implementation significantly. There are only a limited number of times where you need to look up the team_id - so having the bundle -> team mapping in a separate table and having to look it up should not be a problem. And it has much less complexity and "ripple-effect" through the codebase (for example, I could imagine 100s or thousands of already written tests that would have to be adapted if we changed the primary key - whereas there will be pretty much zero impact on existing tests if we just add a bundle -> team lookup table).
>>>>>>>>>
>>>>>>>>>> One other thing I'd point out is that I think including executors per team is a very easy win and quite possible without much work. I already have much of the code written. Executors are already aware of Teams that own them (merged), and I have a PR open to have configuration per team (with a quite simple and isolated approach - I believe you approved it, Jarek). The last piece is updating the scheduling logic to route tasks from a particular Bundle to the correct executor, which shouldn't be much work (though it would be easier if the Task models had a column for the team they belong to, rather than having to look up the Dag and Bundle to get the team). I have a branch where I was experimenting with this logic already.
>>>>>>>>>> Anyhow, long story short, I don't think we necessarily need to remove this piece from the project's scope if it is already partly done and not too difficult.
>>>>>>>>>
>>>>>>>>> Yeah, I hear you here again. Certainly I would not want to just **remove** it from the code. And yep, I totally forgot we have it in. And if we can make it in easily (which it seems we can) - we can also include it in the first iteration. What I really wanted to avoid (from the original design) - again, trying to simplify it, limit the changes and speed up the implementation - is one "complexity" specifically: having to have separate, additional configuration per team.
>>>>>>>>> Not only because it complicates the already complex configuration handling (I know we have a PR for that), but mostly because, if it is not needed, we can simplify the documentation and explain to our users more easily what they need to do to have their own multi-team setup. And I am quite open to keeping multiple executors if we can avoid complicating configuration.
>>>>>>>>>
>>>>>>>>> But I think some details of that, and whether we really need separate configuration, might also come as a result of updating the AIP - I am not quite sure now if we need it, but we can discuss it when we iterate on the AIP.
>>>>>>>>>
>>>>>>>>> J.
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
>>>>>>> For additional commands, e-mail: dev-h...@airflow.apache.org
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
>> For additional commands, e-mail: dev-h...@airflow.apache.org