Dear Community,

I would like to revive this discussion regarding the potential donation of
Dremio's Auth Manager.

Over the past few days, I have explored the suggestion of dividing the
contribution into smaller parts. I am pleased to report that I have
successfully broken down the features into approximately 15 pull
requests, targeting the main Iceberg repository.

While these pull requests are all rather substantial, I think that they
remain within a manageable size for reviewers.

Would this approach be a good path forward? If so, I can share more details
about the timeline and roadmap I have in mind, and of course, I am prepared
to begin the donation as soon as I have the Community's green light.

Thanks,
Alex Dutra


On Wed, Jun 25, 2025 at 9:57 AM Alex Dutra <alex.du...@dremio.com> wrote:

> Hi Daniel, hi all,
>
> Sorry for the late reply. Here are some answers to your questions:
>
> > I was under the impression that the AuthManager implementation was
> relatively small (based on the recent work for the GCP AuthManager)
>
> These are not comparable. The GCP AuthManager is small because it only
> works for GCP, and thus can leverage Google auth libraries (more
> specifically, it uses the google-auth-library-oauth2-http artifact;
> and since this artifact is already a required dependency for
> iceberg-gcp, it doesn't bring in any extra dependency).
>
> Conversely, this AuthManager is a general-purpose AuthManager that can
> work with any IDP.
>
> > The broader community wasn't involved in decisions made about the
> implementation
>
> That’s exactly the purpose of this donation.
>
> > "impersonation flow" which I'm not familiar with
>
> This is a feature where the manager can dynamically fetch the subject
> token for a token exchange, thus managing both the catalog's token and
> the user's token, facilitating impersonation (and delegation) use
> cases. Hence the name (admittedly a bit confusing). This feature is
> still evolving, but we received positive feedback from users and we
> believe it brings a lot of value – and is not something that a
> third-party library could do.
>
> > we need to break it into smaller contributions and figure out the
> appropriate way to review and assimilate the functionality
>
> While we are open to this option, we are concerned about how long it
> might take to complete. In the interim, users have expressed a
> need for improved OAuth2 support. Would it be possible to gain some
> clarity regarding the timeline for a review of this initiative?
> Perhaps an initial review of the current codebase could help identify
> and address any potential roadblocks? I can also schedule a demo of
> the new auth manager, if that helps.
>
> > how well the community understands the behaviors.
>
> While OAuth2 may not be familiar or palatable to most Iceberg
> contributors, I am confident that some of them possess the expertise
> to effectively review and assess the donation.
>
> > The main competency of this project isn't to implement security protocols
>
> This may be true for the GCP auth manager or for the SigV4 one – these
> are vendor-specific and can leverage the respective vendor's SDK. But
> how would we support OAuth2 in a generic way otherwise? Or Kerberos?
> Whether this is a competency of the project or not is debatable.
> Managing HTTP requests is not a main competency of this project
> either, and yet we have one RESTClient interface and one HTTPClient
> implementation, and lots of JSON parsers.
>
> The RESTClient in its current form already implies using some
> authentication protocol. The simple case of using static (provided via
> configuration) tokens does not cover real-world cases that users have
> expressed interest in. Accepting the Auth Manager will certainly
> require some extra attention to security protocols from Iceberg
> maintainers, but it will allow the project to support more advanced
> use cases. Additionally, the Auth Manager provides a path for users of
> the existing, deprecated “/token” endpoint to migrate to standard
> RFC-based OAuth flows.
>
> > Was there any exploration of leveraging other standard implementations
> like Apache Oltu, Nimbus, etc. to build the implementation off of?
>
> Yes, we considered that and decided not to go down that route, for a
> few reasons:
>
> 1. Most OAuth libraries provide building blocks to create clients, but
> they are not fully-fledged clients; you still need to write code in
> order to glue things together [1].
>
> 2. These libraries usually have (too?) many dependencies [2]; some of
> them have not been maintained for a while. And Apache Oltu is retired.
> In contrast, our Auth Manager only has one small dependency:
> auth0-jwt.
>
> 3. If you delegate to a third-party library, then you cannot share the
> catalog's RESTClient or Executor. The library is going to maintain its
> own HTTP client and executor, leading to increased resource
> consumption.
>
> 4. Nothing precludes us from switching to a third-party library later
> on (it's an implementation detail). We thought it best to start with
> a self-contained project.
>
> Thanks,
> Alex
>
> [1]:
> https://connect2id.com/products/nimbus-oauth-openid-connect-sdk/guides/oauth-client-server-development
> [2]: For Nimbus:
> https://central.sonatype.com/artifact/com.nimbusds/oauth2-oidc-sdk/11.26/dependencies
>
> On Thu, Jun 19, 2025 at 5:58 PM Daniel Weeks <dwe...@apache.org> wrote:
> >
> > I hadn't seen this thread before we discussed it yesterday, but since
> then I've taken a look and have some reservations.
> >
> > I was under the impression that the AuthManager implementation was
> relatively small (based on the recent work for the GCP AuthManager), but
> after taking a look at the repo, this is far from a small contribution.
> >
> > I strongly support more robust security support (especially for
> OAuth2/OIDC), but I don't feel this is going to be a small effort to
> introduce.  The broader community wasn't involved in decisions made about
> the implementation and I see elements that give me pause (like
> "impersonation flow" which I'm not familiar with and implementation details
> like extensions to immutables that aren't consistent with the broader
> codebase).
> >
> > If we decide that we want to take this on, I feel like we need to break
> it into smaller contributions and figure out the appropriate way to review
> and assimilate the functionality in a way that's consistent with the rest
> of the project.  Due to this being security related, we should take extra
> precautions around what this introduces and how well the community
> understands the behaviors.
> >
> > However, looking at the complexity here relative to the approach with
> the GCP, I have to question whether this is the right path overall.  The
> main competency of this project isn't to implement security protocols, so
> it's a lot to say we want a full and complete (possibly with extensions)
> native implementation of the OAuth2 specification (there are whole projects
> built around that alone).
> >
> > Was there any exploration of leveraging other standard implementations
> like Apache Oltu, Nimbus, etc. to build the implementation off of?
> >
> > -Dan
> >
> > On Thu, Jun 19, 2025 at 5:33 AM Alex Dutra <alex.du...@dremio.com.invalid>
> wrote:
> >>
> >> Hi Ryan & JB, hi all,
> >>
> >> I think it would be easier to introduce this new manager as an
> >> alternative manager. This would make the migration smoother as it
> >> would give users time to migrate at their convenience. Besides, the
> >> new manager has the notion of "dialects", and can be configured to
> >> behave exactly like the current one (honoring the same config
> >> options), making the migration even easier.
> >>
> >> > Why not contribute the functionality directly to the AuthManager
> already in Iceberg? Is this incompatible or is there a reason the current
> one can't be extended through contributions?
> >>
> >> There are a few reasons why I believe it's not possible to extend the
> >> current manager indefinitely:
> >>
> >> 1. The current auth manager lives in iceberg-core; as we introduce
> >> more features, it will become impractical to keep it there, especially
> >> since some of the features will require third-party dependencies. As a
> >> data point: the new manager contains almost 100 Java production
> >> classes (not counting test classes and build scripts).
> >> 2. The current auth manager has some well-known shortcomings, notably
> >> around token refreshes. It's not possible to fix that without
> >> introducing regressions and potentially breaking many catalog clients
> >> already in production.
> >> 3. As we introduce features like Authorization Code grant support,
> >> interactions with the IDP will become more complex than just a
> >> request-response cycle. Since most of the current logic resides in the
> >> OAuth2Util class, which is entirely public, it won't be an easy task
> >> to introduce support for such complex flows while avoiding binary
> >> incompatibilities.
> >>
> >> Thanks,
> >> Alex
> >>
> >>
> >> On Wed, Jun 18, 2025 at 11:35 PM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
> >> >
> >> > Hi
> >> >
> >> > I think it makes sense to directly add in AuthManager. I don't see
> >> > blockers (with some adaptations). Alex?
> >> >
> >> > From a donation process standpoint (if accepted), I'm happy to help
> >> > with the SGA and IP Clearance.
> >> >
> >> > Regards
> >> > JB
> >> >
> >> > On Wed, Jun 18, 2025 at 9:15 PM Ryan Blue <rdb...@gmail.com> wrote:
> >> > >
> >> > > I think it would be great to bring this functionality into Iceberg.
> I'm curious about your plan for getting it in. It sounds like you're
> suggesting adding the Dremio project to the Iceberg repo and making it
> optional. Why not contribute the functionality directly to the AuthManager
> already in Iceberg? Is this incompatible or is there a reason the current
> one can't be extended through contributions?
> >> > >
> >> > > On Tue, Jun 17, 2025 at 11:23 AM Christian Thiel <
> christian.t.b...@gmail.com> wrote:
> >> > >>
> >> > >> Hey Alex,
> >> > >>
> >> > >> Thanks for the initiative; I really appreciate the effort here!
> >> > >>
> >> > >> Having good auth compatibility in the Catalog ecosystem is key to
> establishing secure standards by making them easy to use. While Iceberg should
> stay open to other means of authentication, OAuth2 is the most widely
> adopted interoperable auth standard, and its role in Iceberg REST reflects
> that. But with human-centric flows like Auth Code (with PKCE 😉) and Device
> Code missing from most standard clients, users often default to handing out
> personal Client ID/secret pairs, which is really bad from a security
> perspective.
> >> > >>
> >> > >> While I can’t speak to the Java details, I fully support bringing
> the functionality into Iceberg. I have tested the proposed code
> successfully with Spark and different IdPs, including Auth & Device Code
> flows with token refresh, as well as token refresh for Client Credential
> flows.
> >> > >>
> >> > >> Thanks!
> >> > >>
> >> > >> Christian
> >> > >>
> >> > >>
> >> > >>
> >> > >> On Mon, 16 Jun 2025 at 20:33, Alex Dutra
> <alex.du...@dremio.com.invalid> wrote:
> >> > >>>
> >> > >>> Hi all,
> >> > >>>
> >> > >>> Dremio recently open-sourced a new implementation of the Auth
> Manager
> >> > >>> API for OAuth2:
> >> > >>>
> >> > >>> https://github.com/dremio/iceberg-auth-manager
> >> > >>>
> >> > >>> I wrote a blog post about it a while ago [1].
> >> > >>>
> >> > >>> Built on top of the Auth Manager API introduced in Iceberg 1.9.0,
> this
> >> > >>> project provides a more flexible and extensible OAuth2 manager
> >> > >>> compared to the built-in equivalent in Iceberg Core. It follows
> OAuth2
> >> > >>> standards strictly, but also provides compatibility with any
> existing
> >> > >>> Apache Iceberg REST catalog, and contains no Dremio-specific
> >> > >>> functionality. To date, this is the only OAuth2 manager fully
> >> > >>> compliant with external identity providers.
> >> > >>>
> >> > >>> Dremio would like to contribute this code to the Apache Iceberg
> >> > >>> project. I am therefore initiating this discussion to determine
> the
> >> > >>> community's interest in accepting this donation.
> >> > >>>
> >> > >>> This project is beneficial to the community because it addresses
> >> > >>> well-known limitations, such as token refresh problems [2][3][4],
> and
> >> > >>> also because it introduces highly anticipated features like the
> >> > >>> Authorization Code grant support [5]. Fixing these limitations or
> >> > >>> adding support for such large features in the built-in manager,
> while
> >> > >>> avoiding any risk of regressions, would have been a lot harder.
> >> > >>>
> >> > >>> Also worth mentioning: this project adheres to the "Iceberg OAuth2
> >> > >>> Client Authentication Guide", proposed by Christian Thiel [6].
> >> > >>>
> >> > >>> This project could initially serve as a runtime-selectable
> alternative
> >> > >>> to the current built-in implementation. Upon reaching sufficient
> >> > >>> maturity, however, it could potentially replace the existing
> manager.
> >> > >>>
> >> > >>> Please share your thoughts by replying to this email.
> Alternatively,
> >> > >>> we can discuss this topic at the Catalog Sync meeting this
> Wednesday,
> >> > >>> June 18th, if that is a more comfortable option for everyone.
> >> > >>>
> >> > >>> Thanks,
> >> > >>>
> >> > >>> Alex
> >> > >>>
> >> > >>> [1]:
> https://medium.com/data-engineering-with-dremio/introducing-dremio-auth-manager-for-apache-iceberg-223827342d19
> >> > >>> [2]: https://github.com/apache/iceberg/issues/12196
> >> > >>> [3]: https://github.com/apache/iceberg/issues/12363
> >> > >>> [4]: https://github.com/apache/iceberg/issues/13030
> >> > >>> [5]: https://github.com/apache/iceberg/issues/10677
> >> > >>> [6]:
> https://docs.google.com/document/d/1buW9PCNoHPeP7Br5_vZRTU-_3TExwLx6bs075gi94xc/edit?tab=t.0#heading=h.hufqidg1ij89
>
