Thanks for the approval on the PR, Rulin!

But a quick question on protocol: do I make the changes gh-yzou is asking
for then ask you to approve again?

(This is my first contribution to Polaris so I don't know what is expected
of me.)

Regards,

Phillip


On Thu, Mar 5, 2026 at 2:35 PM Phillip Henry <[email protected]>
wrote:

> Hi, Rulin.
>
> I've done as you've suggested with the exception of adding
> GCP_CREDENTIALS_PATH_PROPERTY and GCP_SCOPES_PROPERTY to
> GcpAuthenticationParametersDpo as they would not be used in this ticket and
> I would recommend adding them in another ticket that uses them.
>
> Having said that, I don't feel strongly about this and am happy to add if
> you really want them in this ticket.
>
> Otherwise, could you please take another look at the PR?
>
> Regards,
>
> Phillip
>
>
> On Wed, Mar 4, 2026 at 6:58 PM Rulin Xing <[email protected]> wrote:
>
>> Hey Phillip,
>>
>> Personally, I’m leaning toward following the Iceberg catalog properties as
>> they are today.
>>
>> In the Iceberg SDK, there isn't a first-class property for project id,
>> it's
>> currently passed through the `header.x-goog-user-project`, even biglake's
>> public doc follows this pattern.
>>
>> I'd also prefer not to mix BigLake-specific properties with authentication
>> parameters since authentication is more generic, e.g., even for AWS Glue,
>> we don't introduce a special class for it but implement a generic
>> SigV4AuthenticationParameters. As I mentioned earlier, the project id is a
>> property required by the BigLake IRC, but it's not an authentication
>> parameter.
>>
>> Given that, my proposal would be:
>>
>>    - Add an additional headers field in IcebergRestConnectionConfigInfo.
>>       - I think it’s okay not to enforce checking the existence of this
>>       property when the target catalog is GCP.
>> IcebergRestConnectionConfigInfo is
>>       designed to connect to any Iceberg REST–compliant catalog, so if
>> a property
>>       is only required by a specific catalog implementation, it shouldn’t
>> be
>>       validated generically at this layer.
>>    - Introduce GcpAuthenticationParametersDpo, but limit it to two fields
>>    for now that follows *GoogleAuthManager*:
>>       -   GCP_CREDENTIALS_PATH_PROPERTY: if this property is not provided,
>>       it means that we will rely on gcp auth sdk to detect the gcp
>> credentials
>>       from env
>>       -   GCP_SCOPES_PROPERTY
>>
>> Later, we can extend this to support impersonation flow and subscope logic
>> by adding the necessary service identity-related functionality.
>>
>> Regards,
>> Rulin
>>
>> On Wed, Mar 4, 2026 at 3:43 AM Phillip Henry <[email protected]>
>> wrote:
>>
>> > Thanks, Rulin.
>> >
>> > Regarding points 1-3: would a step in the right direction be to call the
>> > DTO *BigLake*AuthenticationParametersDpo rather than *Gcp*
>> > AuthenticationParametersDpo? I'm not trying to solve all the possible
>> GCP
>> > possibilities as I think that is beyond the scope of this ticket.
>> Rather, I
>> > just wanted to connect to my BigLake metastore - something the code
>> > currently enables.
>> >
>> > If this were acceptable to you, I'd then argue that quotaProject is
>> > analogous to, say, SigV4AuthenticationParameters.signingRegion insofar
>> as
>> > both basically define where the request is being sent (albeit one
>> > physically and one logically). If people disagree, I'm happy to put
>> > quotaProject elsewhere but I will need a steer on exactly where it
>> should
>> > go.
>> >
>> > What I'm trying to avoid, however, is the scope of this ticket
>> ballooning.
>> >
>> > Please let me know your thoughts.
>> >
>> > Regards,
>> >
>> > Phillip
>> >
>> >
>> >
>> > On Tue, Mar 3, 2026 at 6:47 PM Rulin Xing <[email protected]> wrote:
>> >
>> > > Hi Phillip,
>> > >
>> > > Thanks for starting this thread! I want to clarify a few points
>> regarding
>> > > to my comment:
>> > >
>> > >    1. GoogleAuthManager.java#L76-L77
>> > >    <
>> > >
>> >
>> https://github.com/apache/iceberg/blob/1.10.x/gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthManager.java#L76-L77
>> > > >:
>> > >    GoogleAuthManager only relies on these two properties for now:
>> > >    gcp.auth.credentials-path and gcp.auth.scopes. google project id is
>> > not
>> > >    needed by GoogleAuthManager
>> > >    2. We should clearly separate authentication parameters from other
>> > >    connection configurations. IcebergRestConnectionConfigInfo is
>> designed
>> > >    to connect to any remote catalog that complies with the Iceberg
>> REST
>> > > spec,
>> > >    while GcpAuthenticationParametersDpo should include only Google
>> > >    authentication–related properties. It should be generic enough to
>> work
>> > > with
>> > >    any catalog that uses Google Auth.
>> > >    3. Google Auth doesn't need the project id, it's GCP big lake that
>> > >    requires this property:
>> > >    https://docs.cloud.google.com/biglake/docs/blms-rest-catalog
>> > >       - If we look at Polaris GCP storage config, we also don’t need
>> to
>> > >       provide a project ID, it only requires the gcs service account:
>> > > (which is
>> > >       not a good design since this property should be provided by
>> > > catalog service
>> > >       and shouldn't be provided by end users):
>> > >
>> > >
>> >
>> https://github.com/apache/polaris/blob/release/1.3.x/spec/polaris-management-service.yml#L1162-L1171
>> > >       - If customers host their catalog server on GCP and use Google
>> Auth
>> > >       for authentication, they do not need to specify a project id
>> > > when creating
>> > >       the Polaris external catalog entity.
>> > >          - For example, if they use GCP Proxy Service (API Gateway) to
>> > >          expose REST APIs and use google auth, no project id is
>> required.
>> > >       - GCP BigLake requires both header.x-goog-user-project and
>> > warehouse
>> > >       (catalog name). These are used to identify the catalog on the
>> > server
>> > > side
>> > >       because BigLake supports multiplexing (it’s a multi-tenant
>> > > catalog service
>> > >       where each tenant can have multiple catalogs). Neither of these
>> > > properties
>> > >       is related to authentication.
>> > >          - From a design perspective, this is not ideal. For
>> comparison,
>> > in
>> > >          Glue, a single property is used to achieve the same purpose,
>> > > e.g.,:
>> > >             - warehouse=<aws_account_id>:s3tables/<catalog_name>
>> > >             - Here, aws_account_id is equivalent to the Google project
>> > ID,
>> > >             - and s3tables/<catalog_name> is equivalent to BigLake’s
>> > >             warehouse.
>> > >          4. Another part that is not mentioned is the service identity
>> > >    part, we can discuss this later
>> > >
>> > > Thanks.
>> > > Rulin
>> > >
>> > > On Mon, Mar 2, 2026 at 6:40 AM Phillip Henry <[email protected]
>> >
>> > > wrote:
>> > >
>> > > > Regarding the points Rulin Xing raises on ticket 3729
>> > > > <https://github.com/apache/polaris/pull/3729#discussion_r2866294655
>> >,
>> > I
>> > > > wanted to get some feedback from the community on the following.
>> > > >
>> > > >
>> > > >    1. I'd argue that project id property should not live on
>> > > >    IcebergRestConnectionConfigInfo as it's GCP specific AFAIK
>> > > >    2. If it were to live in IcebergRestConnectionConfigInfo as a
>> map of
>> > > >    properties in (I think that's what you're suggesting RX -
>> correct me
>> > > if
>> > > > I'm
>> > > >    wrong) then its presence could not be enforced for GCP calls.
>> > > >    3. GcpAuthenticationParametersDpo "doesn't contain any
>> > > >    authentication-related info" but it does contain only what is
>> > > necessary
>> > > > to
>> > > >    trigger AuthManagers to use GCP and also to provide the means to
>> do
>> > > that
>> > > >    via its asIcebergCatalogProperties and
>> > asAuthenticationParametersModel
>> > > >    methods. I'd argue that this way of doing it leverages the
>> machinery
>> > > > that's
>> > > >    already in place and minimally impacts the rest of the codebase.
>> > > >
>> > > >
>> > > > So, I'd propose removing the warehouse field as RX has helpfully
>> > pointed
>> > > > out but I think GcpAuthenticationParametersDpo should stay even if
>> its
>> > > DTO
>> > > > counterpart is revised.
>> > > >
>> > > > Thoughts?
>> > > >
>> > >
>> >
>>
>

Reply via email to