Dear Community,

I would like to revive this discussion regarding the potential donation of Dremio's Auth Manager.

Over the past few days, I have explored the suggestion of dividing the contribution into smaller parts. I am pleased to report that I have successfully broken down the features into approximately 15 pull requests, targeting the main Iceberg repository. While these pull requests are all rather substantial, I think that they remain within a manageable size for reviewers.

Would this approach be a good path forward? If so, I can share more details about the timeline and roadmap I have in mind, and of course, I am prepared to begin the donation as soon as I have the Community's green light.

Thanks,
Alex Dutra

On Wed, Jun 25, 2025 at 9:57 AM Alex Dutra <alex.du...@dremio.com> wrote:
> Hi Daniel, hi all,
>
> Sorry for the late reply. Here are some answers to your questions:
>
> > I was under the impression that the AuthManager implementation was relatively small (based on the recent work for the GCP AuthManager)
>
> These are not comparable. The GCP AuthManager is small because it only works for GCP, and thus can leverage Google auth libraries (more specifically, it uses the google-auth-library-oauth2-http artifact; and since this artifact is already a required dependency for iceberg-gcp, it doesn't bring in any extra dependency).
>
> Conversely, this AuthManager is a general-purpose AuthManager that can work with any IDP.
>
> > The broader community wasn't involved in decisions made about the implementation
>
> That’s exactly the purpose of this donation.
>
> > "impersonation flow" which I'm not familiar with
>
> This is a feature where the manager can dynamically fetch the subject token for a token exchange, thus managing both the catalog's token and the user's token, facilitating impersonation (and delegation) use cases. Hence the name (admittedly a bit confusing). This feature is still evolving, but we received positive feedback from users and we believe it brings a lot of value – and is not something that a third-party library could do.
>
> > we need to break it into smaller contributions and figure out the appropriate way to review and assimilate the functionality
>
> While we are open to this option, we are concerned about the potential duration of its completion. In the interim, users have expressed a need for improved OAuth2 support. Would it be possible to gain some clarity regarding the timeline for a review of this initiative? Perhaps an initial review of the current codebase could help identify and address any potential roadblocks? I can also schedule a demo of the new auth manager, if that helps.
>
> > how well the community understands the behaviors.
>
> While OAuth2 may not be familiar or palatable to most Iceberg contributors, I am confident that some of them possess the expertise to effectively review and assess the donation.
>
> > The main competency of this project isn't to implement security protocols
>
> This may be true for the GCP auth manager or for the SigV4 one – these are vendor-specific and can leverage the respective vendor's SDK. But how would we support OAuth2 in a generic way otherwise? Or Kerberos? Whether this is a competency of the project or not is debatable. Managing HTTP requests is not a main competency of this project either, and yet we have one RESTClient interface and one HTTPClient implementation, and lots of JSON parsers.
>
> The RESTClient in its current form already implies using some authentication protocol. The simple case of using static (provided via configuration) tokens does not cover real-world cases that users have expressed interest in. Accepting the Auth Manager will certainly require some extra attention to security protocols from Iceberg maintainers, but it will allow the project to support more advanced use cases. Additionally, the Auth Manager provides a path for users of the existing, deprecated “/token” endpoint to migrate to standard RFC-based OAuth flows.
>
> > Was there any exploration of leveraging other standard implementations like Apache Oltu, Nimbus, etc. to build the implementation off of?
>
> Yes, we considered that and decided not to go down that route. For a few reasons:
>
> 1. Most OAuth libraries provide building blocks to create clients, but they are not fully-fledged clients; you still need to write code in order to glue things together [1].
>
> 2. These libraries usually have (too?) many dependencies [2]; some of them have not been maintained for a while. And Apache Oltu is retired. In contrast, our Auth Manager only has one small dependency: auth0-jwt.
>
> 3. If you delegate to a third-party library, then you cannot share the catalog's RESTClient or Executor. The library is going to maintain its own HTTP client and executor, leading to increased resource consumption.
>
> 4. Nothing precludes us from switching to a third-party library later on (it's an implementation detail). We thought it's best to start with a self-contained project.
>
> Thanks,
> Alex
>
> [1]: https://connect2id.com/products/nimbus-oauth-openid-connect-sdk/guides/oauth-client-server-development
> [2]: For Nimbus: https://central.sonatype.com/artifact/com.nimbusds/oauth2-oidc-sdk/11.26/dependencies
>
> On Thu, Jun 19, 2025 at 5:58 PM Daniel Weeks <dwe...@apache.org> wrote:
> >
> > I hadn't seen this thread before we discussed it yesterday, but since then I've taken a look and have some reservations.
> >
> > I was under the impression that the AuthManager implementation was relatively small (based on the recent work for the GCP AuthManager), but after taking a look at the repo, this is far from a small contribution.
> >
> > I strongly support more robust security support (especially for OAuth2/OIDC), but I don't feel this is going to be a small effort to introduce.
> > The broader community wasn't involved in decisions made about the implementation and I see elements that give me pause (like "impersonation flow" which I'm not familiar with and implementation details like extensions to immutables that aren't consistent with the broader codebase).
> >
> > If we decide that we want to take this on, I feel like we need to break it into smaller contributions and figure out the appropriate way to review and assimilate the functionality in a way that's consistent with the rest of the project. Due to this being security related, we should take extra precautions around what this introduces and how well the community understands the behaviors.
> >
> > However, looking at the complexity here relative to the approach with the GCP, I have to question whether this is the right path overall. The main competency of this project isn't to implement security protocols, so it's a lot to say we want a full and complete (possibly with extensions) native implementation of the OAuth2 specification (there are whole projects built around that alone).
> >
> > Was there any exploration of leveraging other standard implementations like Apache Oltu, Nimbus, etc. to build the implementation off of?
> >
> > -Dan
> >
> > On Thu, Jun 19, 2025 at 5:33 AM Alex Dutra <alex.du...@dremio.com.invalid> wrote:
> >>
> >> Hi Ryan & JB, hi all,
> >>
> >> I think it would be easier to introduce this new manager as an alternative manager. This would make the migration smoother as it would give users time to migrate at their convenience. Besides, the new manager has the notion of "dialects", and can be configured to behave exactly like the current one (honoring the same config options), making the migration even easier.
> >>
> >> > Why not contribute the functionality directly to the AuthManager already in Iceberg? Is this incompatible or is there a reason the current one can't be extended through contributions?
> >>
> >> There are a few reasons why I believe it's not possible to extend the current manager indefinitely:
> >>
> >> 1. The current auth manager lives in iceberg-core; as we introduce more features, it will become impractical to keep it there, especially since some of the features will require third-party dependencies. As a data point: the new manager contains almost 100 Java production classes (not counting test classes and build scripts).
> >> 2. The current auth manager has some well known shortcomings, notably around token refreshes. It's not possible to fix that without introducing regressions and potentially breaking many catalog clients already in production.
> >> 3. As we introduce features like Authorization Code grant support, interactions with the IDP will become more complex than just a request-response cycle. Since most of the current logic resides in the OAuth2Util class, which is entirely public, it won't be an easy task to introduce support for such complex flows while avoiding binary incompatibilities.
> >>
> >> Thanks,
> >> Alex
> >>
> >>
> >> On Wed, Jun 18, 2025 at 11:35 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> >> >
> >> > Hi
> >> >
> >> > I think it makes sense to directly add in AuthManager. I don't see blockers (with some adaptations). Alex ?
> >> >
> >> > From a donation process standpoint (if accepted), I'm happy to help with the SGA and IP Clearance.
> >> >
> >> > Regards
> >> > JB
> >> >
> >> > On Wed, Jun 18, 2025 at 9:15 PM Ryan Blue <rdb...@gmail.com> wrote:
> >> > >
> >> > > I think it would be great to bring this functionality into Iceberg. I'm curious about your plan for getting it in. It sounds like you're suggesting adding the Dremio project to the Iceberg repo and making it optional. Why not contribute the functionality directly to the AuthManager already in Iceberg?
> >> > > Is this incompatible or is there a reason the current one can't be extended through contributions?
> >> > >
> >> > > On Tue, Jun 17, 2025 at 11:23 AM Christian Thiel <christian.t.b...@gmail.com> wrote:
> >> > >>
> >> > >> Hey Alex,
> >> > >>
> >> > >> Thanks for the Initiative — I really appreciate the effort here!
> >> > >>
> >> > >> Having good auth compatibility in the Catalog ecosystem is key to establish secure standards by making them easy to use. While Iceberg should stay open to other means of Authentication, OAuth2 is the most widely adopted interoperable auth standard, and its role in Iceberg REST reflects that. But with human-centric flows like Auth Code (with PKCE 😉) and Device Code missing from most standard clients, users often default to handing out personal Client ID/secret pairs—which is really bad from a security perspective.
> >> > >>
> >> > >> While I can’t speak to the Java details, I fully support bringing the functionality into Iceberg. I have tested the proposed code successfully with Spark and different IdPs, including Auth & Device Code flows with token refresh, as well as token refresh for Client Credential flows.
> >> > >>
> >> > >> Thanks!
> >> > >>
> >> > >> Christian
> >> > >>
> >> > >>
> >> > >> On Mon, 16 Jun 2025 at 20:33, Alex Dutra <alex.du...@dremio.com.invalid> wrote:
> >> > >>>
> >> > >>> Hi all,
> >> > >>>
> >> > >>> Dremio recently open-sourced a new implementation of the Auth Manager API for OAuth2:
> >> > >>>
> >> > >>> https://github.com/dremio/iceberg-auth-manager
> >> > >>>
> >> > >>> I wrote a blog post about it a while ago [1].
> >> > >>>
> >> > >>> Built on top of the Auth Manager API introduced in Iceberg 1.9.0, this project provides a more flexible and extensible OAuth2 manager compared to the built-in equivalent in Iceberg Core.
> >> > >>> It follows OAuth2 standards strictly, but also provides compatibility with any existing Apache Iceberg REST catalog, and contains no Dremio-specific functionality. To date, this is the only OAuth2 manager fully compliant with external identity providers.
> >> > >>>
> >> > >>> Dremio would like to contribute this code to the Apache Iceberg project. I am therefore initiating this discussion to determine the community's interest in accepting this donation.
> >> > >>>
> >> > >>> This project is beneficial to the community because it addresses well-known limitations, such as token refresh problems [2][3][4], and also because it introduces highly anticipated features like the Authorization Code grant support [5]. Fixing these limitations or adding support for such large features in the built-in manager, while avoiding any risk of regressions, would have been a lot harder.
> >> > >>>
> >> > >>> Also worth mentioning: this project adheres to the "Iceberg OAuth2 Client Authentication Guide", proposed by Christian Thiel [6].
> >> > >>>
> >> > >>> This project could initially serve as a runtime-selectable alternative to the current built-in implementation. Upon reaching sufficient maturity however, it could potentially replace the existing manager.
> >> > >>>
> >> > >>> Please share your thoughts by replying to this email. Alternatively, we can discuss this topic at the Catalog Sync meeting this Wednesday, June 18th, if that is a more comfortable option to everyone.
> >> > >>>
> >> > >>> Thanks,
> >> > >>>
> >> > >>> Alex
> >> > >>>
> >> > >>> [1]: https://medium.com/data-engineering-with-dremio/introducing-dremio-auth-manager-for-apache-iceberg-223827342d19
> >> > >>> [2]: https://github.com/apache/iceberg/issues/12196
> >> > >>> [3]: https://github.com/apache/iceberg/issues/12363
> >> > >>> [4]: https://github.com/apache/iceberg/issues/13030
> >> > >>> [5]: https://github.com/apache/iceberg/issues/10677
> >> > >>> [6]: https://docs.google.com/document/d/1buW9PCNoHPeP7Br5_vZRTU-_3TExwLx6bs075gi94xc/edit?tab=t.0#heading=h.hufqidg1ij89