> Other than the feature being a consistent request on our Airflow surveys, we have a number of users that have asked and continue to ask when a multi-team solution would be available in Airflow
This is precisely one of my points. I don’t believe that AIP-67 as described on the wiki page will address the needs of many of the users asking for multi-team. (Ask three Airflow users what they want from multi-team, and you’ll get five different answers.)

As far as I can work out, and again, please correct me if I'm wrong, the only real difference to users from the multi-team solution over running multiple Airflows is the ability to “communicate” via Datasets/Assets. (Variables aren’t shared, Connections aren’t shared, workers aren’t shared. The webserver and scheduler could be/are shared, but reducing the resource consumption of a deployment is explicitly not a goal.)

And if that is really all this AIP delivers to us, then my hypothesis is that a) we’ll miss the mark on what many users actually want from multi-team, and b) we could already get the “communicate between two teams’ DAGs” benefit today with no changes to Airflow by using the AssetWatcher that Vincent already added in 3.0.0.

-ash

> On 12 Jun 2025, at 16:24, Bishundeo, Rajeshwar <rbish...@amazon.com.INVALID> wrote:
>
> Ash, you've raised some good points on the need to re-evaluate AIP-67, although I'm a bit confused on how AIP-82 factors into a multi-team solution. It's fair to have the discussion on how Airflow has changed and perhaps on either redefining what AIP-67 means... or creating a set of new AIPs solving a subset of a larger need.
>
> We have seen talks at previous summits (and even at the upcoming summit) where users have demonstrated their implementation of multi-team. I can't help but feel that they are being creative with some of those solutions (not that there's anything wrong with that), but that is because one doesn't exist in Airflow. Other than the feature being a consistent request on our Airflow surveys, we have a number of users that have asked and continue to ask when a multi-team solution would be available in Airflow.
>
> I think the next dev call (06/26) is a great time to dive into this further.
>
> -- Rajesh
>
> On 2025-06-12, 9:34 AM, "Jarek Potiuk" <ja...@potiuk.com> wrote:
>
> Yep. It's a valid point that we should re-evaluate things now after Airflow 3 is out - the reason why we delayed it was that we wanted to get more clarity on what implementation and scope of the Airflow 3 changes will be and see how it fits.
>
> I wonder what others - especially those who run Airflow at scale and hear the users asking for different forms of multi-team - would say to the expectations they have and how they map to Airflow 3 - maybe indeed we might come up with a simpler way of achieving those expectations.
>
> Definitely worth discussing it.
>
> J.
>
> On Thu, Jun 12, 2025 at 2:15 PM Ash Berlin-Taylor <a...@apache.org> wrote:
>
>> Hi everyone,
>>
>> One thing I’ve been struggling with while reading the other thread about multi-team DB changes[0] is what is the end-user problem we are trying to address with it.
>>
>> The main impetus for opening this discussion is that a lot has changed in Airflow since this AIP was created in early 2024 and voted on mid-2024, and I'm wondering if those changes are big enough to invalidate the design and assumptions made at the time.
>>
>> Reading the DB changes thread I see that the changes are far reaching and necessarily have to touch most of the Airflow object models, and this got me thinking about what value we actually get from the change, since as stated in the AIP some of the non-goals are[1] (slightly edited here for brevity with the “[…]”):
>>
>>> • Sharing broker/backend for celery executors between teams. This MAY be covered by future AIPs
>>> • Implementation of FAB-based multi-team Auth Manager. […]
>>> • Per-team concurrency and prioritization of tasks. […].
>>> • Resource allocation per-executor. In the current proposal, executors are run as sub-processes of Scheduler and we have very little control over their individual resource usage. […]
>>> • Turn-key multi-team Deployment of Airflow (for example via Helm chart). This is unlikely to happen. […]
>>> • Team management tools (creation, removal, rename etc.). […]
>>> • Combining "global" execution with "team" execution. While it should be possible in the proposed architecture to have a "team" execution and "global" execution in a single instance of Airflow, this has its own unique set of challenges and the assumption is that an Airflow Deployment is either "global" (today) or "multi-team" (after this AIP is implemented) - but it cannot be combined (yet). This is possible to be implemented in the future.
>>> • Running multiple schedulers - one-per team. While it should be possible if we add support to select DAGs "per team" per scheduler, this is not implemented in this AIP and left for the future
>>
>> And also Design Non-goals from the AIP [2]:
>>
>>> • It’s not a primary goal of this proposal to significantly decrease resource consumption for Airflow installation compared to the current ways of achieving “multi-tenant” setup. […]
>>> • It’s not a goal of the proposal to provide a one-stop installation mechanism for “Multi-team” Airflow. […]
>>> • It’s not a goal to decrease the overall maintenance effort involved in responding to needs of different teams, […]
>>
>> The main pain point that we seem to be addressing with this AIP is this[3]:
>>
>>> The main reason for having multi-team deployment of Airflow is achieving security and isolation between the teams, coupled with ability of the isolated teams to collaborate via shared Datasets.
>>
>> So what’s changed since we collectively (myself included) voted on and accepted this AIP? Well, we now have AIP-82 (External event driven dags). That could be used to achieve this goal right now in 3.0 with no changes to Airflow itself, and is perhaps a more robust mechanism of doing it too.
>>
>> So my main question: given the wide-reaching code changes needed for AIP-67, and (IMO) the imperfect/limited scope of team completion, I wonder if using AIP-82 would not be a better solution to the problem.
>>
>> 1. It’s much simpler from a code level, as nothing needs to change
>> 2. It’s not _that_ much more complex from an operational point of view (you have to run an extra scheduler and web server, but those would likely need scaling up.)
>> 3. We won’t disappoint people by not implementing the parts of multi-team that they want (someone being part of multiple teams, sharing connections/vars between teams)
>>
>> And using this mechanism (of external dataset/asset polling) also negates one of the biggest cons of AIP-67, that of the tight coupling of Airflow versions between the teams. In larger companies this is a _huge_ problem already, and this would only make it worse.
>>
>> So my idea (and at this stage it is only an idea for discussion) is that we re-evaluate AIP-67 in light of what exists in Airflow 3.0 now and decide if it’s still worth the added complexity of DB, code and operational overhead, and decide if we still want it.
>>
>> Please, please, please point out if there are other benefits that I have missed. I'm not trying to be selective and get my way; I'm trying to make sure Airflow continues to meet the needs of users, and can also continue to evolve (where I worry that complexity of code/datamodel materially hurts that final point).
>>
>> Thoughts?
>>
>> [0]: https://lists.apache.org/thread/78vndnybgpp705j6sm77l1t6xbrtnt5c
>> [1]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=294816378#AIP67MultiteamdeploymentofAirflowcomponents-Whatisexcludedfromthescope?
>> [2]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=294816378#AIP67MultiteamdeploymentofAirflowcomponents-DesignNonGoals
>> [3]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=294816378#AIP67MultiteamdeploymentofAirflowcomponents-Whyisitneeded?
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@airflow.apache.org
>> For additional commands, e-mail: dev-h...@airflow.apache.org