And to Kaxil's mail: yep. What you wrote is exactly what I understood
needs to be done.

On Fri, Feb 17, 2023 at 2:40 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>
> > Understood. I like the idea of extensibility and "Airflow as a platform." 
> > However, we should make sure that we do not worsen the user experience with 
> > the extensibility. The "User Management Provider" is something that could 
> > potentially make the user experience worse, especially for customers who 
> > are self-hosting Airflow. Managed services will ensure that they dedicate 
> > resources to maintaining their user management providers. Multi-tenancy 
> > will end up becoming a feature for managed service customers, leaving the 
> > 74% of Airflow users [1] with a less powerful Airflow. As an example, 
> > Timetables is a very powerful feature, which, anecdotally, no customer ends 
> > up using due to its complexity.
>
> I do not think this will happen. I think part of the effort should be
> not only to implement the API but also to provide a fully fledged (though
> simple) implementation of such a provider which works with an
> open-source implementation of identity - KeyCloak is one that comes to
> my mind. It's possibly jumping ahead a bit to say "let's use KeyCloak
> as reference provider we can release", but I think KeyCloak has all we
> need:
> * integration with multiple authentication providers and protocols
> * User Management:
> https://wjw465150.gitbooks.io/keycloak-documentation/content/server_admin/topics/users/viewing.html
> * Role Management including user mapping:
> https://wjw465150.gitbooks.io/keycloak-documentation/content/server_admin/topics/roles/user-role-mappings.html
> * Group management:
> https://wjw465150.gitbooks.io/keycloak-documentation/content/server_admin/topics/groups/groups-vs-roles.html
>
> It comes with a management console, CLI and much more
> (auditing/session management etc. etc.)
>
> In a way it would simply provide much the same as what the FAB
> Security Manager does, but with a much more complete scope and - most
> importantly - it would not be "part of Airflow as FAB is", it would be
> "outside" of it and the only thing Airflow would provide is merely
> pointers to the Docs of Keycloak on how to integrate it with Airflow
> as a proxy: 
> https://wjw465150.gitbooks.io/keycloak-documentation/content/server_installation/topics/proxy.html
> (or it could be done by writing Airflow KeyCloak Adapter - to be
> decided what would be easier to maintain). The users will be free to
> configure KeyCloak proxy as they see fit. No DB needed in Airflow to
> manage any of those, no UI, no API, no CLI - all that delegated out
> and integrated via incoming headers or adapter.
>
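To make the "incoming headers" option above concrete, here is a minimal
sketch, not a proposed implementation. It assumes the proxy in front of
Airflow has already authenticated the request and injects the user and
group names as headers. The header names (X-Forwarded-User,
X-Forwarded-Groups), the Identity class and identity_from_headers are all
hypothetical; they are not Airflow or KeyCloak APIs.

```python
from dataclasses import dataclass
from typing import FrozenSet, Mapping, Optional

# Hypothetical header names: a real proxy or adapter would define its own.
USER_HEADER = "X-Forwarded-User"
GROUPS_HEADER = "X-Forwarded-Groups"


@dataclass(frozen=True)
class Identity:
    """Who the request belongs to, as asserted by the trusted proxy."""

    user_id: str
    groups: FrozenSet[str]


def identity_from_headers(headers: Mapping[str, str]) -> Optional[Identity]:
    """Build an Identity from proxy-injected headers, or None if absent."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    user = normalised.get(USER_HEADER.lower(), "").strip()
    if not user:
        return None
    raw_groups = normalised.get(GROUPS_HEADER.lower(), "")
    # Assumption: groups arrive as a comma-separated list.
    groups = frozenset(g.strip() for g in raw_groups.split(",") if g.strip())
    return Identity(user_id=user, groups=groups)
```

Something like this only makes sense if Airflow is reachable solely through
the proxy, since anyone who can reach the webserver directly could set
these headers themselves.
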
> The users will have several choices:
>
> 1) For existing users/those who want to keep all "in-airflow-ui" they
> could use FAB Provider (which will be separated from the Core). Same
> as today, but without the advanced management features for groups and
> tenants. We might consider dropping that altogether eventually.
> 2) If they are on premise - they can use KeyCloak Provider - by
> following our advice/suggestions/simple guidelines on how to
> integrate. They would have to manage their own KeyCloak instance (it
> won't be a "standard" part of Airflow).
> 3) If the user runs on AWS/Azure/GCP/others - each cloud would
> (hopefully) develop their own provider to integrate with IAM etc. -
> they could use that provider directly. Or they could use and manage
> their KeyCloak in the cloud as they see fit (it supports all the
> clouds' OAuth integration). Or develop their own provider.
> 4) Those on managed services will have no choice but to use the
> provider installed by their service.
>
> I think that all gives the user the choice - if they want role
> management and multi-tenant capabilities, fine, but they will have to
> manage the users outside of Airflow and integrate Airflow with it (and
> they can either integrate with what they have already or use
> KeyCloak). And it does not really impair them.
>
> J.
>
>
> On Thu, Feb 16, 2023 at 6:27 AM Mehta, Shubham
> <shu...@amazon.com.invalid> wrote:
> >
> > Thanks, Kaxil – that helped to clarify the proposal a bit more.
> >
> > > Replacing Access Control provided by FAB with a base/core security model 
> > > (that is still resource-based)
> >
> > Are you suggesting that we build this resource-driven security model 
> > directly into Airflow, without relying on external dependencies like FAB?
> >
> > > Extend this to the other Airflow components (scheduler, workers, 
> > > triggerer, cli)
> >
> > Are there cases where the scheduler or CLI would require the authorization 
> > API? Since they are considered trusted components, I assumed they would not 
> > need it.
> >
> >
> > Jarek - as always, I appreciate you sharing your thoughts and having an 
> > open discussion.
> >
> > > Which really explains what "Airflow as a Platform" is all about. I do not 
> > > think we already know all the parts that should be converted into 
> > > "Airflow extendability". It's more of an incremental effort like that 
> > > where we have those bright ideas "Hey - this part can be removed and 
> > > delegated to others".  I think this has never been formulated explicitly 
> > > but I think for quite a while we are really in the mode where we think 
> > > much more about what we can SPLIT OUT from Airflow rather than what we 
> > > can ADD to Airflow.
> >
> > Understood. I like the idea of extensibility and "Airflow as a platform." 
> > However, we should make sure that we do not worsen the user experience with 
> > the extensibility. The "User Management Provider" is something that could 
> > potentially make the user experience worse, especially for customers who 
> > are self-hosting Airflow. Managed services will ensure that they dedicate 
> > resources to maintaining their user management providers. Multi-tenancy 
> > will end up becoming a feature for managed service customers, leaving the 
> > 74% of Airflow users [1] with a less powerful Airflow. As an example, 
> > Timetables is a very powerful feature, which, anecdotally, no customer ends 
> > up using due to its complexity.
> >
> > I am still unclear about other user scenarios related to user management, 
> > besides multi-tenancy, that Airflow customers are looking to enable. While 
> > the extensibility we aim for will enable this, is there a need for it? 
> > Also, @Google-folks, @Astronomer-folks, @Azure-folks, et al. - are you 
> > interested in building a custom user management provider that works with 
> > your platform? Have there been cases where your customers were limited by 
> > the current permissioning model, and you considered replacing FAB?
> >
> > I believe that the primary motivation for "user management provider" is 
> > driven by the excitement around getting rid of FAB, which I think we can 
> > still achieve while including multi-tenancy in the core Airflow. Both 
> > should be treated as separate problems.
> >
> > References:
> > 1. 
> > https://airflow.apache.org/blog/airflow-survey-2022/#how-do-you-deploy-airflow-multiple-choice
> >
> > On 2023-02-14, 12:44 PM, "Jarek Potiuk" <ja...@potiuk.com> wrote:
> >
> >     CAUTION: This email originated from outside of the organization. Do not 
> > click links or open attachments unless you can confirm the sender and know 
> > the content is safe.
> >
> >
> >
> >     Comment to Shubham's question:
> >
> >     > In addition, are there any other user scenarios, beyond 
> > multi-tenancy, that Airflow users are looking to enable and that require 
> > this pluggability? Asking as I haven't come across them. Overall, I believe 
> > we need more information on your proposal before seeking feedback from the 
> > community. Could we work together during February to develop a concrete 
> > proposal?
> >
> >     I am glad you asked. I think this is one of the things I wanted to
> >     achieve by adding this page
> >     
> > https://github.com/apache/airflow/blob/main/docs/apache-airflow/public-airflow-interface.rst
> >     - it will be live in 2.6 and one of the main parts is this one:
> >
> >     
> > https://github.com/apache/airflow/blob/main/docs/apache-airflow/public-airflow-interface.rst#using-public-interface-to-extend-airflow-capabilities
> >
> >     Which really explains what "Airflow as a Platform" is all about. I do
> >     not think we already know all the parts that should be converted into
> >     "Airflow extendability". It's more of an incremental effort like that
> >     where we have those bright ideas "Hey - this part can be removed and
> >     delegated to others".  I think this has never been formulated
> >     explicitly but I think for quite a while we are really in the mode
> >     where we think much more about what we can SPLIT OUT from Airflow
> >     rather than what we can ADD to Airflow.
> >
> >     When you look at it, this is also the main idea behind Open Lineage
> >     integration for example - we are adding Open Lineage (which is really
> >     just an API) so that others can build "everything-lineage" on top of
> >     it. So we are adding a minimum-possible set of APIs and integration so
> >     that we can expose the lineage capability so that all the lineage "UI"
> >     and other use cases that lineage exposes would be done outside. We are
> >     in a strong position to do it - being sure that when we expose it,
> >     others will implement the integration they care about.
> >
> >     I think more and more (and it has been preached by Ash mostly, but
> >     also others) that we should be focusing solely on being an extremely
> >     powerful and robust scheduler and make sure we are exposing all of the
> >     possible things that can be exposed as an external API (while still
> >     providing basic implementation that makes airflow still a "finished"
> >     product that can be used to handle basic cases).
> >
> >     BTW. We are now preparing for the Airflow Summit CFP (some
> >     announcements will follow shortly, I do not want to spill too many
> >     beans) and we have a very interesting broad category "Airflow and
> >     ...." . And I think we should work in the direction that the `...` is
> >     far bigger than Airflow itself.
> >
> >     J.
> >
> >     On Tue, Feb 14, 2023 at 12:34 PM Kaxil Naik <kaxiln...@gmail.com> wrote:
> >     >
> >     > Great idea Vikram, I love the idea of making this a 
> > provider/pluggable.
> >     >
> >     > In some ways, we already have a pluggable mechanism for 
> > Authentication with Auth Backends [1]. Where we will need a lot more work I 
> > think is:
> >     >
> >     > Replacing Access Control provided by FAB with a base/core security 
> > model (that is still resource-based) [2]
> >     > Extend this to the other Airflow components (scheduler, workers, 
> > triggerer, cli) or make them all driven by a single API that takes care of 
> > Auth. This will also reduce a lot of duplication of code across many of the 
> > components
> >     > For backwards compatibility, we could ship with a FAB provider that still 
> > uses Flask-AppBuilder in addition to our recommended provider that will 
> > have more features and users/companies/stakeholders can build on top of 
> > that provider to extend it further.
> >     >
> >     >
> >     > References:
> >     > [1]: 
> > https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#auth-backends
> >     > [2]: 
> > https://airflow.apache.org/docs/apache-airflow/stable/administration-and-deployment/security/access-control.html
> >     >
> >     > On Tue, 14 Feb 2023 at 02:06, Mehta, Shubham 
> > <shu...@amazon.com.invalid> wrote:
> >     >>
> >     >> Hi Vikram,
> >     >> Thank you for taking the time to review the proposal. I appreciate 
> > your insights — I will make sure to reach out to you directly in the future 
> > for feedback as that would've undoubtedly saved us some time and effort.
> >     >>
> >     >> In regards to the separation of user management, I understand your 
> > concerns and, on a high-level, I agree with you. However, I think it would 
> > be beneficial to have more details on how it will work. Here are a few 
> > questions that come to mind:
> >     >> 1. How will the user-id/group-id interface interact with Airflow 
> > resource-level permissions? What parts of "John can-edit dag1 and can-view 
> > dag2" will be part of Airflow core? What will be exposed to the external system?
> >     >> 2. Who will be responsible for managing the resource-level 
> > permissions? Will it be the external system?
> >     >> 3. What are the limitations of this new pluggable model compared to 
> > FAB? Will there be restrictions on the granularity of resource access that 
> > Airflow admins can provide to their users?
> >     >> 4. As Jarek pointed out, with this change we want to make 
> > authorization externally driven. Will this have a significant impact on 
> > Airflow performance as authorization will be required for fetching 
> > variables, executing tasks, etc.?
> >     >> 5. What will the migration process look like for existing users to 
> > this non-FAB pluggable model?
> >     >>
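For question 1, one possible shape of such an interface, purely as a
sketch: Airflow core asks a single yes/no question about (user, action,
resource), and the provider decides how to answer it. AuthProvider,
InMemoryAuthProvider, can and the action names below are illustrative
names chosen for this example, not taken from Airflow.

```python
from typing import Iterable, Protocol, Tuple

# A grant is (user_id, action, resource), e.g. ("john", "can_edit", "dag1").
Grant = Tuple[str, str, str]


class AuthProvider(Protocol):
    """Hypothetical contract a user management provider would implement."""

    def is_authorized(self, user_id: str, action: str, resource: str) -> bool:
        ...


class InMemoryAuthProvider:
    """Toy provider that keeps grants in a set; stands in for an external system."""

    def __init__(self, grants: Iterable[Grant]) -> None:
        self._grants = set(grants)

    def is_authorized(self, user_id: str, action: str, resource: str) -> bool:
        return (user_id, action, resource) in self._grants


def can(provider: AuthProvider, user_id: str, action: str, resource: str) -> bool:
    """What core would call; it never looks at how grants are stored."""
    return provider.is_authorized(user_id, action, resource)


provider = InMemoryAuthProvider(
    [("john", "can_edit", "dag1"), ("john", "can_view", "dag2")]
)
```

With this split, "John can-edit dag1 and can-view dag2" lives entirely in
the provider, and core only knows the resource names and the actions it
asks about. Whether that is the right boundary is exactly the open
question.
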
> >     >> In addition, are there any other user scenarios, beyond 
> > multi-tenancy, that Airflow users are looking to enable and that require 
> > this pluggability? Asking as I haven't come across them. Overall, I believe 
> > we need more information on your proposal before seeking feedback from the 
> > community. Could we work together during February to develop a concrete 
> > proposal?
> >     >>
> >     >> Besides this, I would like to propose that we define the scope and 
> > long-term vision of "Airflow core". To achieve this, it may be helpful to 
> > first outline the perspectives of the Airflow PMCs. Recently, there have 
> > been discussions regarding the separation of executors into a separate 
> > package, the implementation of pluggable schedulers, and other related 
> > topics. Currently, these decisions and discussions are somewhat ad hoc and 
> > are made through the mailing list. I would be happy to collaborate and 
> > invest time in this effort.
> >     >>
> >     >> Regards
> >     >> Shubham
> >     >>
> >     >> On 2023-02-13, 11:04 AM, "Jarek Potiuk" <ja...@potiuk.com> wrote:
> >     >>
> >     >>     Hey Vikram,
> >     >>
> >     >>     I think it's brilliant and I wonder how it happened that it had not
> >     >>     occurred to us earlier. And I believe that is due to the natural
> >     >>     tendency of "following as we always did" rather than thinking
> >     >>     completely out-of-the-box. Thanks Vikram for bringing it up.
> >     >>
> >     >>     The funny thing is that when I see this:
> >     >>
> >     >>     > However, I don't agree that this level of user management 
> > belongs in "Core Airflow".
> >     >>
> >     >>     I almost immediately think - NOOOOO, why, it's always been here, 
> > how
> >     >>     can we remove it?
> >     >>
> >     >>     But then if you look a bit closer:
> >     >>
> >     >>     > think this is a time to consider the concept of a "user 
> > management provider" with a simple built-in implementation being the 
> > current Airflow functionality, enabling alternate more complex (but 
> > separate) implementations such as your proposal here as alternate user 
> > management providers.
> >     >>
> >     >>     Then it starts to make way more sense. Way more.
> >     >>
> >     >>     And when you look further:
> >     >>
> >     >>     >  Maybe, this also enables us to get rid of the Fab security 
> > manager from core Airflow?
> >     >>
> >     >>     My heart jumps and I am immediately sold on the idea.
> >     >>
> >     >>     When I was commenting on the doc initially, something was not 
> > right.
> >     >>     I had a feeling it is probably the 5th time I am looking and
> >     >>     commenting on a similar document. And, well, I did, actually. 
> > Most of
> >     >>     the things we discussed there are already implemented out there. 
> > We
> >     >>     just need to make sure we expose enough of the API to use them. 
> > For
> >     >>     example we have Keycloak that is an open source implementation of
> >     >>     Identity and Access Management. With everything out there already
> >     >>     integrated. And I've been part of the project that integrated 
> > just the
> >     >>     authentication part. Now if we rethink the authorization and 
> > make it
> >     >>     simpler and "externally driven", this will not only be faster 
> > IMHO,
> >     >>     but also will allow enterprise users to integrate much better.
> >     >>
> >     >>     I believe following the path that Vikram outlined will be a good
> >     >>     direction for everyone in the community - including all the 
> > Managed
> >     >>     Service providers, who will have a far easier job on integrating
> >     >>     Airflow into their authentication models.
> >     >>
> >     >>     J.
> >     >>
> >     >>
> >     >>
> >     >>     On Mon, Feb 13, 2023 at 6:24 PM Vikram Koka
> >     >>     <vik...@astronomer.io.invalid> wrote:
> >     >>     >
> >     >>     > Shubham and Vincent,
> >     >>     >
> >     >>     > Let me start by saying that I apologize for my delayed 
> > response to your original email.
> >     >>     >
> >     >>     > I appreciate the detailed write-up and the thought behind it. 
> > I completely agree with your use case and understand how this is applicable 
> > to enterprises with multiple data teams using Airflow.
> >     >>     >
> >     >>     > However, I don't agree that this level of user management 
> > belongs in "Core Airflow".
> >     >>     >
> >     >>     > I strongly believe that the core Airflow mission is for the 
> > community at large and for data practitioners either individuals or teams 
> > within enterprises. And therefore, I don't disagree with the intent of 
> > making it easier for enterprise teams to adopt Airflow. But, I think there 
> > is a never ending list of user management features which are needed to 
> > support Enterprise needs. We have already struggled with this over time and 
> > faced challenges with the Fab security manager and its integration in 
> > Airflow.
> >     >>     >
> >     >>     > I think we should use this opportunity and your use case to 
> > "separate the user management" from Core Airflow outside of the absolute 
> > basics. I think this is a time to consider the concept of a "user 
> > management provider" with a simple built-in implementation being the 
> > current Airflow functionality, enabling alternate more complex (but 
> > separate) implementations such as your proposal here as alternate user 
> > management providers. Maybe, this also enables us to get rid of the Fab 
> > security manager from core Airflow?
> >     >>     >
> >     >>     > Best regards,
> >     >>     > Vikram
> >     >>     >
> >     >>     >
> >     >>     > On Fri, Feb 3, 2023 at 8:22 AM Beck, Vincent 
> > <vincb...@amazon.com.invalid> wrote:
> >     >>     >>
> >     >>     >> Thanks
> >     >>     >>
> >     >>     >> On 2023-02-03, 10:55 AM, "Jarek Potiuk" <ja...@potiuk.com> 
> > wrote:
> >     >>     >>
> >     >>     >>     Added.
> >     >>     >>
> >     >>     >>     On Fri, Feb 3, 2023 at 3:53 PM Beck, Vincent
> >     >>     >>     <vincb...@amazon.com.invalid> wrote:
> >     >>     >>     >
> >     >>     >>     > Thank you! 
> > https://cwiki.apache.org/confluence/display/~vin100.beck
> >     >>     >>     >
> >     >>     >>     > On 2023-02-02, 5:38 PM, "Jarek Potiuk" 
> > <ja...@potiuk.com> wrote:
> >     >>     >>     >
> >     >>     >>     >     What's your cwiki ID, Vincent? (I'll add you without 
> > going into details yet)
> >     >>     >>     >
> >     >>     >>
> >     >>
> >
