Hi Phillip, Re: protocol:
Every change to the PR resets previous approvals. However, let this not prevent you from making changes you consider worthwhile. Approvals can be repeated. In PRs all comments / concerns should be addressed. If the original comment author indicates assent but remains dormant after that and does not comment on / approve new changes, consensus is assumed. If consensus cannot be reached in GH or on dev ML, the next step is to try an in-person discussion during one of the Community Sync calls. This is my personal interpretation of the Polaris guidelines [1] and previous practice in this project. [1] https://polaris.apache.org/community/contributing-guidelines/ Cheers, Dmitri. On Fri, Mar 6, 2026 at 5:03 AM Phillip Henry <[email protected]> wrote: > Thanks for the approval on the PR, Rulin! > > But a quick question on protocol: do I make the changes gh-yzou is asking > for then ask you to approve again? > > (This is my first contribution to Polaris so I don't know what is expected > of me.) > > Regards, > > Phillip > > > On Thu, Mar 5, 2026 at 2:35 PM Phillip Henry <[email protected]> > wrote: > > > Hi, Rulin. > > > > I've done as you've suggested with the exception of adding > > GCP_CREDENTIALS_PATH_PROPERTY and GCP_SCOPES_PROPERTY to > > GcpAuthenticationParametersDpo as they would not be used in this ticket > and > > I would recommend adding them in another ticket that uses them. > > > > Having said that, I don't feel strongly about this and am happy to add if > > you really want them in this ticket. > > > > Otherwise, could you please take another look at the PR? > > > > Regards, > > > > Phillip > > > > > > On Wed, Mar 4, 2026 at 6:58 PM Rulin Xing <[email protected]> wrote: > > > >> Hey Phillip, > >> > >> Personally, I’m leaning toward following the Iceberg catalog properties > as > >> they are today. > >> > >> In the Iceberg SDK, there isn't a first-class property for project id, > >> it's > >> currently passed through the `header.x-goog-user-project`, even > biglake's > >> public doc follows this pattern. > >> > >> I'd also prefer not to mix BigLake-specific properties with > authentication > >> parameters since authentication is more generic, e.g., even for AWS > Glue, > >> we don't introduce a special class for it but implement a generic > >> SigV4AuthenticationParameters. As I mentioned earlier, the project id > is a > >> property required by the BigLake IRC, but it's not an authentication > >> parameter. > >> > >> Given that, my proposal would be: > >> > >> - Add an additional headers field in IcebergRestConnectionConfigInfo. > >> - I think it’s okay not to enforce checking the existence of this > >> property when the target catalog is GCP. > >> IcebergRestConnectionConfigInfo is > >> designed to connect to any Iceberg REST–compliant catalog, so if > >> a property > >> is only required by a specific catalog implementation, it > shouldn’t > >> be > >> validated generically at this layer. > >> - Introduce GcpAuthenticationParametersDpo, but limit it to two > fields > >> for now that follows *GoogleAuthManager*: > >> - GCP_CREDENTIALS_PATH_PROPERTY: if this property is not > provided, > >> it means that we will rely on gcp auth sdk to detect the gcp > >> credentials > >> from env > >> - GCP_SCOPES_PROPERTY > >> > >> Later, we can extend this to support impersonation flow and subscope > logic > >> by adding the necessary service identity-related functionality. > >> > >> Regards, > >> Rulin > >> > >> On Wed, Mar 4, 2026 at 3:43 AM Phillip Henry <[email protected]> > >> wrote: > >> > >> > Thanks, Rulin. > >> > > >> > Regarding points 1-3: would a step in the right direction be to call > the > >> > DTO *BigLake*AuthenticationParametersDpo rather than *Gcp* > >> > AuthenticationParametersDpo? I'm not trying to solve all the possible > >> GCP > >> > possibilities as I think that is beyond the scope of this ticket. > >> Rather, I > >> > just wanted to connect to my BigLake metastore - something the code > >> > currently enables. > >> > > >> > If this were acceptable to you, I'd then argue that quotaProject is > >> > analogous to, say, SigV4AuthenticationParameters.signingRegion insofar > >> as > >> > both basically define where the request is being sent (albeit one > >> > physically and one logically). If people disagree, I'm happy to put > >> > quotaProject elsewhere but I will need a steer on exactly where it > >> should > >> > go. > >> > > >> > What I'm trying to avoid, however, is the scope of this ticket > >> ballooning. > >> > > >> > Please let me know your thoughts. > >> > > >> > Regards, > >> > > >> > Phillip > >> > > >> > > >> > > >> > On Tue, Mar 3, 2026 at 6:47 PM Rulin Xing <[email protected]> wrote: > >> > > >> > > Hi Phillip, > >> > > > >> > > Thanks for starting this thread! I want to clarify a few points > >> regarding > >> > > to my comment: > >> > > > >> > > 1. GoogleAuthManager.java#L76-L77 > >> > > < > >> > > > >> > > >> > https://github.com/apache/iceberg/blob/1.10.x/gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthManager.java#L76-L77 > >> > > >: > >> > > GoogleAuthManager only relies on these two properties for now: > >> > > gcp.auth.credentials-path and gcp.auth.scopes. google project id > is > >> > not > >> > > needed by GoogleAuthManager > >> > > 2. We should clearly separate authentication parameters from > other > >> > > connection configurations. IcebergRestConnectionConfigInfo is > >> designed > >> > > to connect to any remote catalog that complies with the Iceberg > >> REST > >> > > spec, > >> > > while GcpAuthenticationParametersDpo should include only Google > >> > > authentication–related properties. It should be generic enough to > >> work > >> > > with > >> > > any catalog that uses Google Auth. > >> > > 3. Google Auth doesn't need the project id, it's GCP big lake > that > >> > > requires this property: > >> > > https://docs.cloud.google.com/biglake/docs/blms-rest-catalog > >> > > - If we look at Polaris GCP storage config, we also don’t need > >> to > >> > > provide a project ID, it only requires the gcs service > account: > >> > > (which is > >> > > not a good design since this property should be provided by > >> > > catalog service > >> > > and shouldn't be provided by end users): > >> > > > >> > > > >> > > >> > https://github.com/apache/polaris/blob/release/1.3.x/spec/polaris-management-service.yml#L1162-L1171 > >> > > - If customers host their catalog server on GCP and use Google > >> Auth > >> > > for authentication, they do not need to specify a project id > >> > > when creating > >> > > the Polaris external catalog entity. > >> > > - For example, if they use GCP Proxy Service (API Gateway) > to > >> > > expose REST APIs and use google auth, no project id is > >> required. > >> > > - GCP BigLake requires both header.x-goog-user-project and > >> > warehouse > >> > > (catalog name). These are used to identify the catalog on the > >> > server > >> > > side > >> > > because BigLake supports multiplexing (it’s a multi-tenant > >> > > catalog service > >> > > where each tenant can have multiple catalogs). Neither of > these > >> > > properties > >> > > is related to authentication. > >> > > - From a design perspective, this is not ideal. For > >> comparison, > >> > in > >> > > Glue, a single property is used to achieve the same > purpose, > >> > > e.g.,: > >> > > - warehouse=<aws_account_id>:s3tables/<catalog_name> > >> > > - Here, aws_account_id is equivalent to the Google > project > >> > ID, > >> > > - and s3tables/<catalog_name> is equivalent to BigLake’s > >> > > warehouse. > >> > > 4. Another part that is not mentioned is the service > identity > >> > > part, we can discuss this later > >> > > > >> > > Thanks. > >> > > Rulin > >> > > > >> > > On Mon, Mar 2, 2026 at 6:40 AM Phillip Henry < > [email protected] > >> > > >> > > wrote: > >> > > > >> > > > Regarding the points Rulin Xing raises on ticket 3729 > >> > > > < > https://github.com/apache/polaris/pull/3729#discussion_r2866294655 > >> >, > >> > I > >> > > > wanted to get some feedback from the community on the following. > >> > > > > >> > > > > >> > > > 1. I'd argue that project id property should not live on > >> > > > IcebergRestConnectionConfigInfo as it's GCP specific AFAIK > >> > > > 2. If it were to live in IcebergRestConnectionConfigInfo as a > >> map of > >> > > > properties in (I think that's what you're suggesting RX - > >> correct me > >> > > if > >> > > > I'm > >> > > > wrong) then its presence could not be enforced for GCP calls. > >> > > > 3. GcpAuthenticationParametersDpo "doesn't contain any > >> > > > authentication-related info" but it does contain only what is > >> > > necessary > >> > > > to > >> > > > trigger AuthManagers to use GCP and also to provide the means > to > >> do > >> > > that > >> > > > via its asIcebergCatalogProperties and > >> > asAuthenticationParametersModel > >> > > > methods. I'd argue that this way of doing it leverages the > >> machinery > >> > > > that's > >> > > > already in place and minimally impacts the rest of the > codebase. > >> > > > > >> > > > > >> > > > So, I'd propose removing the warehouse field as RX has helpfully > >> > pointed > >> > > > out but I think GcpAuthenticationParametersDpo should stay even if > >> its > >> > > DTO > >> > > > counterpart is revised. > >> > > > > >> > > > Thoughts? > >> > > > > >> > > > >> > > >> > > >
