I very much prefer going with Option 1 only.

Especially if we can generate a Python client that can easily "hide" the
necessary auth workflow and be extended with different mechanisms. I
think the fact that a JWT token is used should be very well hidden, and
there should be a pluggable way of extending the Python client to use
different authentication methods to get the token. I do not know how
the new generators work, but I imagine they could work like that.

One of the issues with the previously generated client was that the set of
available authentication mechanisms had to be specified when the client was
generated. In our case it meant that the client we published had a fixed set
of auth mechanisms, and to add a new auth method you had to regenerate the
client. You could not "plug in" the new authentication into the PyPI
package; you had to build your own package.

If the JWT auth mechanism allows us to generate a client that has a
pluggable mechanism to retrieve the token and authenticate (which in this
case should be absolutely possible), then for me that would be a step up
from the current approach and even a good argument on its own for why we
should do it.
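
A minimal sketch of such a pluggable mechanism, assuming the client only
needs a callable that returns a token. Every name here (`ApiClient`,
`TokenProvider`, `static_token_provider`) is hypothetical and is not taken
from any generated Airflow client:

```python
from typing import Callable, Dict, Optional

# Hypothetical names throughout: nothing here is the real Airflow client API.
# A token provider is any zero-argument callable that returns a JWT string.
TokenProvider = Callable[[], str]


class ApiClient:
    """Toy client that hides the JWT behind a pluggable token provider."""

    def __init__(self, base_url: str, token_provider: TokenProvider) -> None:
        self.base_url = base_url.rstrip("/")
        self._token_provider = token_provider
        self._token: Optional[str] = None

    def auth_headers(self) -> Dict[str, str]:
        # Fetch the token lazily and cache it; callers never handle the JWT.
        if self._token is None:
            self._token = self._token_provider()
        return {"Authorization": f"Bearer {self._token}"}

    def invalidate_token(self) -> None:
        # Drop the cached token so the next call asks the provider again,
        # e.g. after the server rejects an expired token.
        self._token = None


def static_token_provider(token: str) -> TokenProvider:
    """Simplest possible plug-in: return a token obtained elsewhere."""
    return lambda: token
```

With this shape, a new authentication method (Kerberos, username and
password, and so on) would be just another callable passed to the
constructor, so the published package would not need to be regenerated.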

Of course it has some drawbacks, for example for debugging and for cases
where you want to do your API calls "by hand": you cannot just slap your
headers on every request; you have to do some logic and retrieve the JWT
token first.
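
To illustrate the extra step, here is a rough sketch of the two requests a
"by hand" caller would have to build. The `/auth/token` path and the
username/password JSON body are assumptions made only for illustration; the
real endpoint and payload depend on the auth manager. Nothing is sent over
the network here, the requests are only constructed:

```python
import json
import urllib.request


def build_token_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Step 1: request a JWT (hypothetical endpoint path and body shape)."""
    body = json.dumps({"username": username, "password": password}).encode()
    return urllib.request.Request(
        f"{base_url}/auth/token",  # assumed path, not confirmed by the thread
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_api_request(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Step 2: call the public API with the token as a Bearer header."""
    return urllib.request.Request(
        f"{base_url}{path}",
        headers={"Authorization": f"Bearer {token}"},
    )
```

Each request would then be sent with `urllib.request.urlopen`, with the
token read from the first response before the second request is built.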

So Option 1 only would be my preference.


On Fri, Feb 28, 2025 at 7:17 PM Beck, Vincent <vincb...@amazon.com.invalid>
wrote:

> Hi everyone,
>
> I would like to talk about auth backends.
>
> In Airflow 2, there are multiple options for authenticating REST API
> calls. These options are called auth backends (
> https://airflow.apache.org/docs/apache-airflow-providers-fab/stable/auth-manager/api-authentication.html).
> The deployment manager configures which authentication mechanism is used
> for REST API calls. There are several options:
> - session
> - basic_auth
> - Kerberos
> - Google OpenID
>
> For example, if the deployment manager sets `auth_backends =
> airflow.providers.fab.auth_manager.api.auth.backend.basic_auth`, then users
> must authenticate their REST API calls using basic authentication
> (username/password). Note: Multiple auth backends can be configured (e.g.
> `auth_backends =
> airflow.providers.fab.auth_manager.api.auth.backend.basic_auth,airflow.providers.fab.auth_manager.api.auth.backend.session`).
>
> Most (if not all) of these auth backends are provided by providers. They
> follow the same interface and must define two functions:
> - `init_app`. To initialize resources if needed
> - `requires_authentication`. Checks whether the authentication provided in
> the request is valid. For example, `basic_auth` extracts authentication
> information from the Flask request and checks whether the username/password
> provided are valid. If the authentication succeeds then the user is saved
> in session so that it can be used subsequently by the API itself.
>
> In Airflow 3, all APIs (but here we will focus only on the public API) use
> JWT tokens for authentication. Every API request must include a valid JWT
> token to be authenticated. This is not an issue when using the UI since the
> UI manages JWT tokens automatically. However, what happens when users call
> the public API directly?
>
> I see three possible options:
>
> Option 1. Deprecate auth backends and introduce an API to generate JWT
> tokens. This approach aligns with how modern web applications handle
> authentication. It would simplify authentication in Airflow by enforcing a
> single strategy: JWT-based authentication. User flow:
> - Call an API to generate a JWT token, providing authentication details
> such as a username and password
> - If authentication succeeds, the API returns a JWT token
> - Use this JWT token to authenticate public API calls
>
> The API for generating JWT tokens would be provided by the auth manager.
> The Simple auth manager already supports this, and Bugra is working on
> adding it to the FAB auth manager (PR:
> https://github.com/apache/airflow/pull/47043).
>
> This is personally my preferred solution but there are some caveats:
> - Users will need to update their authentication methods. However, since
> Airflow 3 already introduces breaking changes, users will need to adjust
> their integrations regardless.
> - Some authentication strategies, such as Google OpenID, would no longer be
> possible. Since authentication shifts from the auth backend to the auth
> manager, providers without an auth manager (like Google) would lose their
> auth backends. More explanations on this: because the API to create a JWT
> token is provided by the auth manager, a provider could support all the
> different authentication mechanisms provided by its auth backends through
> this API. Example: The FAB provider currently supports basic_auth and
> Kerberos as auth backends. In Airflow 3, the FAB auth manager could support
> both authentication mechanisms for generating JWT tokens.
>
> Option 2. Update auth backends to be compatible with Airflow 3. To support
> Airflow 3, we would need to:
> - Modify auth backends to use FastAPI instead of Flask
> - Update the `requires_authentication` function, since it currently
> validates authentication and stores the user in a session—an approach
> incompatible with Airflow 3.
> - Ensure compatibility with both Airflow 2 and Airflow 3, meaning we would
> have two implementations for each auth backend (one for AF2, one for AF3).
>
> Pros:
> - Backward compatibility: Users can continue using existing authentication
> methods across Airflow 2 and Airflow 3.
>
> Cons:
> - Increased maintenance complexity due to dual implementations.
>
> Option 3. Support both Option 1 and Option 2. This would give users the
> flexibility to choose their preferred authentication method.
>
> What do you think?
>
> Vincent
>
