Hi Yun,

Just trying to achieve clarity too:

> I’d like to clarify my point regarding configuration ownership. [...]

Who defines property names such as "s3.secret-access-key" and their meaning?

My view is that Polaris defines those names and links their meaning to the
corresponding concepts in S3 API / spec. Polaris exposes those values to
its clients, therefore they should have a common understanding of the
meaning of those values. Since Polaris owns the API that exposes those
properties, Polaris also controls their definitions.

Even though properties must align with a storage provider's API, Polaris
still needs to clearly define their meaning.

Therefore, if Polaris were to expose new similar properties for another
technology, it would effectively require a change in Polaris specs to
define the new names and what the corresponding values represent. WDYT?

I previously mentioned this in [1] but the emphasis was on using structured
OpenAPI types. Now, I agree that a property bag is fine to use as the API
type, but I still think adding new property names requires a spec change
(in the description section, given the current PR state).
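
To illustrate the point, here is a minimal sketch (Python, illustrative
only, not Polaris code) of a property bag checked against the names the
spec documents. The property names come from the list quoted further down
in this thread; the registry, the helper function and the "newstore.token"
name are hypothetical.

```python
# Hypothetical registry of property names documented in the spec.
# The names are taken from the list quoted later in this thread;
# the registry itself and the helper below are illustrative only.
WELL_KNOWN_PROPERTIES = {
    "s3.access-key-id": "AWS access key ID",
    "s3.secret-access-key": "AWS secret access key",
    "s3.session-token": "Temporary STS session token",
    "client.region": "AWS region",
}


def undocumented_properties(bag: dict[str, str]) -> list[str]:
    """Return the names in the bag that the registry does not define, sorted."""
    return sorted(name for name in bag if name not in WELL_KNOWN_PROPERTIES)


bag = {
    "s3.access-key-id": "AKIA-EXAMPLE",
    "client.region": "us-west-2",
    "newstore.token": "abc",  # made-up name for a provider the spec does not cover
}
print(undocumented_properties(bag))  # prints ['newstore.token']
```

The bag type itself never changes, but a client can only interpret
"newstore.token" once its name and meaning are written down somewhere.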

> >>> you're planning to expose different ordinary configuration for
different storage prefixes too
> Could you clarify this question a bit further?

I mean: does the "prefix" in the current API spec [2] apply to
non-credential properties?
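
To make the question concrete, here is a minimal sketch (Python,
illustrative only) of table-wide defaults merged with the overrides of the
longest matching storage prefix. It is not taken from Polaris or Iceberg;
the function name, the bucket paths and the endpoint value are assumptions.
If the prefix applies to non-credential properties, an entry such as
"s3.endpoint" could differ per prefix in the same way a token does.

```python
def resolve_config(
    location: str,
    defaults: dict[str, str],
    overrides: dict[str, dict[str, str]],
) -> dict[str, str]:
    """Merge table defaults with the overrides of the longest matching prefix."""
    matching = [prefix for prefix in overrides if location.startswith(prefix)]
    merged = dict(defaults)  # copy so the defaults are not mutated
    if matching:
        # The most specific (longest) prefix wins over shorter ones.
        merged.update(overrides[max(matching, key=len)])
    return merged


defaults = {"client.region": "us-west-2"}
overrides = {
    # Hypothetical prefix carrying a credential and a plain setting.
    "s3://bucket/a/": {
        "s3.session-token": "token-a",
        "s3.endpoint": "https://storage.example.invalid",
    },
}

print(resolve_config("s3://bucket/a/data.parquet", defaults, overrides))
print(resolve_config("s3://bucket/b/data.parquet", defaults, overrides))
```

The second call falls back to the table defaults because no prefix matches.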

[1] https://github.com/apache/polaris/pull/3826#discussion_r2835541976

[2] https://github.com/apache/polaris/pull/3826#discussion_r2878717169

Thanks,
Dmitri.

On Mon, Mar 2, 2026 at 7:14 PM yun zou <[email protected]> wrote:

> Hi Dmitri,
>
> I’d like to clarify my point regarding configuration ownership.
> Ultimately, it is the storage providers who determine what
> configurations are required to access their storage systems. Polaris
> simply provides a mechanism to capture and pass along those
> configurations. In that sense, the required configuration is defined
> and controlled by the storage providers, not by Polaris.
>
> From my understanding, Iceberg introduced the new storageCredential
> field mainly for two reasons:
> 1) The existing config field contains a mix of different configuration
> types and lacks clear documentation, making it difficult to reuse or
> extend for other purposes.
> 2) A new credentials endpoint was introduced, which requires a clearer
> and more structured definition of the configurations it accepts.
>
> For Polaris, as you mentioned, we have the flexibility to design this
> correctly from the start. Since we do not plan to introduce a separate
> credentials endpoint, and there are only very limited additional
> configurations required beyond the credentials themselves for storage
> access, it would be more practical to keep everything consolidated to
> maintain simplicity.
>
> Furthermore, because we will clearly document all supported
> configurations, we can establish a well-defined boundary for what
> belongs in this field and avoid the ambiguity seen elsewhere.
>
> >>> you're planning to expose different ordinary configuration for
> different storage prefixes too
> Could you clarify this question a bit further? My proposal remains the
> same as what the current PR outlines (except the naming): we would
> have a single configuration bag that contains all configurations
> required to access storage, and different providers would be
> distinguished based on the storage prefix.
>
> Best Regards,
> Yun
>
> On Mon, Mar 2, 2026 at 3:40 PM Dmitri Bourlatchkov <[email protected]>
> wrote:
> >
> > Hi Yun,
> >
> > I'm fine with a property bag.
> >
> > Still, for the sake of clarity, I'd like to mention that it is not
> storage
> > providers who control this configuration. Polaris controls it via its API
> > specification.
> >
> > Storage Providers control the meaning of values that go into those
> > properties. Property names remain within the scope of Polaris'
> > responsibility to define and document for its clients.
> >
> > So, adding a new storage provider type will require updating the Polaris
> > API spec to clearly define the new property names and how they relate to
> > what the storage provider expects for request authentication.
> >
> > Re: the separation of plain configuration and credentials, my reading of
> > the Iceberg discussion makes me lean toward separating them upfront. This
> > is what the Iceberg community seems to prefer too, but they are burdened
> by
> > backward compatibility to older clients. Polaris does not have that
> burden
> > since we're exposing vended credentials for the first time in the Generic
> > Tables API. WDYT?
> >
> > Perhaps I misunderstood this proposal and you're planning to expose
> > different ordinary configuration for different storage prefixes too
> > (similar to credentials)... Is that so?
> >
> > Thanks,
> > Dmitri.
> >
> >
> >
> > On Mon, Mar 2, 2026 at 4:37 PM yun zou <[email protected]>
> wrote:
> >
> > > Hi Dmitri,
> > >
> > > Thanks for bringing up this discussion.
> > >
> > > I believe keeping the storage configuration as a property bag is more
> > > beneficial for us, since these settings are fundamentally controlled
> > > by the cloud provider rather than Polaris. Leaving it as a property
> > > bag gives us more flexibility and allows us to onboard new I/O
> > > providers more quickly. Onboarding one additional provider may not
> > > seem significant, but if we need to do this more frequently, it could
> > > add up to a substantial amount of work.
> > >
> > > I’m definitely +1 on having clear documentation.
> > >
> > > Regarding other parameters such as client.region, since these are all
> > > configurations required to initialize the client and access the
> > > storage, I don’t think we need to split them into separate fields.
> > > That said, I agree that the name StorageAccessCredentials can be
> > > misleading. Perhaps we could rename it to StorageAccessConfigs, to
> > > better reflect that it includes all configurations needed to access
> > > remote storage, including credentials. WDYT?
> > >
> > > By the way, I also started a thread in the Apache Iceberg community to
> > > better understand the historical context behind the
> > > storage-credentials fields. From the discussion so far, it seems that
> > > one of the main reasons is the lack of clear documentation.
> > >
> > > Best regards,
> > > Yun
> > >
> > > On Mon, Feb 23, 2026 at 9:48 AM Dmitri Bourlatchkov <[email protected]>
> > > wrote:
> > > >
> > > > Hi Alex,
> > > >
> > > > I'm generally fine with not distinguishing credential properties at
> the
> > > API
> > > > level. So property bags for table-defaults plus path-specific
> overrides
> > > > sound reasonable to me.
> > > >
> > > > However, Iceberg has a separate credentials section in
> LoadTableResult.
> > > Do
> > > > you know the rationale for that?
> > > >
> > > > [1]
> > > >
> > >
> https://github.com/apache/iceberg/blob/apache-iceberg-1.10.1/open-api/rest-catalog-open-api.yaml#L3321
> > > >
> > > > Cheers,
> > > > Dmitri.
> > > >
> > > > On Mon, Feb 23, 2026 at 12:07 PM Alexandre Dutra <[email protected]>
> > > wrote:
> > > >
> > > > > Hi Dmitri,
> > > > >
> > > > > + 1 for keeping the property bag at the API level and document
> > > > > well-known properties.
> > > > >
> > > > > On the credential vs non-credential topic: my point is that in
> > > > > Iceberg's `LoadTableResult`, the `config` field could contain
> storage
> > > > > credentials as well, and that would be, afaict, perfectly valid:
> the
> > > > > credentials would then be valid for any prefix. Polaris, btw, does
> > > > > exactly that [1]. Therefore it would be more accurate, strictly
> > > > > speaking, to distinguish two scopes: "table defaults", and
> > > > > "prefix-specific overrides"; these would be orthogonal to the
> > > > > credential vs non-credential distinction, as illustrated in this
> > > > > example: [2].
> > > > >
> > > > > Thanks,
> > > > > Alex
> > > > >
> > > > > [1]:
> > > > >
> > >
> https://github.com/apache/polaris/blob/8b108d6be7222a8ed78b1b2b70816ecbeea1b327/runtime/service/src/main/java/org/apache/polaris/service/catalog/iceberg/IcebergCatalogHandler.java#L876-L881
> > > > > [2]:
> https://gist.github.com/adutra/1b6adb2c169c08bd91cd2a01ab4338d3
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Feb 23, 2026 at 4:12 PM Dmitri Bourlatchkov <
> [email protected]>
> > > > > wrote:
> > > > > >
> > > > > > Hi Alex,
> > > > > >
> > > > > > Re: config property mixing, my concern is not so much about
> prefix
> > > vs.
> > > > > > non-prefix config, but more about credentials vs. non-credentials
> > > > > > configuration.
> > > > > >
> > > > > > I believe distinguishing them at the API level will help
> > > implementations
> > > > > > treat them correctly too. Conversely, mixing credentials with
> other
> > > > > config
> > > > > > increases the chance of mis-handling them on the client side
> (e.g.
> > > > > logging
> > > > > > when they are not supposed to be logged).
> > > > > >
> > > > > > Your point about the fact that these properties will eventually
> be
> > > mixed
> > > > > on
> > > > > > the client side is quite valid. Therefore, I will not insist on
> > > > > separating
> > > > > > credential properties from ordinary config in the API.
> > > > > >
> > > > > > However, if the properties are mixed we should not use the name
> > > > > > "StorageAccessCredential" (note the last word) for their
> container,
> > > > > because
> > > > > > that (IMHO) would actually make the API spec confusing. I'd
> suggest
> > > > > naming
> > > > > > the container "StorageAccessConfiguration" (other names are
> welcome
> > > too).
> > > > > >
> > > > > > Re: well-structured properties objects - after thinking about it
> some
> > > > > more,
> > > > > > I believe your points about client compatibility prevail. I'd
> think
> > > > > clients
> > > > > > > should be able to handle new / unknown JSON properties, but it
> would
> > > > > > certainly be an overhead in many cases.
> > > > > >
> > > > > > I'm fine using a generic property bag as the type at the API
> level.
> > > > > >
> > > > > > I still believe we should document exact property names in the
> spec.
> > > For
> > > > > > that matter I propose moving property descriptions from Open API
> > > > > _comments_
> > > > > > to Open API _description_ fields. This way, notes about property
> > > meaning
> > > > > > will not be lost after automated processing of the spec (e.g. by
> > > > > > SwaggerHub). WDYT?
> > > > > >
> > > > > > Thanks,
> > > > > > Dmitri.
> > > > > >
> > > > > > On Mon, Feb 23, 2026 at 7:27 AM Alexandre Dutra <
> [email protected]>
> > > > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > > I do not think it is correct to mix the former with the
> latter.
> > > > > > >
> > > > > > > I don't think mixing general and prefix-specific
> configurations is
> > > > > > > inherently incorrect—it's a grey area. However, I agree that
> it is
> > > > > > > likely better practice to separate them. I believe the
> sensitivity
> > > > > > > level of the configuration should not be the deciding factor,
> as
> > > any
> > > > > > > configuration in a LoadTable response should be treated as
> > > sensitive.
> > > > > > >
> > > > > > > In Iceberg, the distinction in `LoadTableResult` between
> `config`
> > > and
> > > > > > > `storage-credentials.config` is primarily one of scope: the
> former
> > > > > > > applies broadly to the entire FileIO, and the latter is
> specific
> > > to an
> > > > > > > S3 client tied to a particular prefix. It's important to note,
> > > though,
> > > > > > > that any table-wide properties are ultimately merged with
> > > > > > > prefix-specific properties when the S3 client is created [1].
> > > > > > >
> > > > > > > > Open API can be leveraged to report them as well-structured
> > > objects
> > > > > > > (JSON)
> > > > > > >
> > > > > > > The idea is interesting, especially since we control the
> > > specification
> > > > > > > and thus know the supported properties.
> > > > > > >
> > > > > > > However, this approach may complicate the evolution of the
> > > > > > > specification. Our guidance on evolution [2] states that new
> > > releases
> > > > > > > of Polaris should maintain compatibility with older clients.
> This
> > > > > > > requires that any addition or deprecation of properties must be
> > > done
> > > > > > > in a backward-compatible way. Removing or renaming properties
> would
> > > > > > > necessitate a major version bump for the specification. I am
> not
> > > > > > > convinced that the potential benefits outweigh these
> consequences.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Alex
> > > > > > >
> > > > > > > [1]:
> > > > > > >
> > > > >
> > >
> https://github.com/apache/iceberg/blob/fec9800bcc0c4073ca727f3b3bfdc2f34abb26a3/aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java#L411-L415
> > > > > > > [2]: https://polaris.apache.org/releases/1.3.0/evolution/
> > > > > > >
> > > > > > >
> > > > > > > On Sat, Feb 21, 2026 at 1:12 AM Dmitri Bourlatchkov <
> > > [email protected]>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > Hi All,
> > > > > > > >
> > > > > > > > I'm transferring some points from my GH comments [3826] here,
> > > for a
> > > > > wider
> > > > > > > > discussion.
> > > > > > > >
> > > > > > > > 1) Apparently some of the response properties relate to
> actual
> > > > > > > credentials
> > > > > > > > (key, expiry time), while others are more general
> configuration
> > > items
> > > > > > > (e.g.
> > > > > > > > the refresh endpoint).
> > > > > > > >
> > > > > > > > > I do not think it is correct to mix the former with the
> latter.
> > > > > Primarily
> > > > > > > > because of their different leak sensitivity levels but also
> > > because
> > > > > > > > `StorageAccessCredential` are provided as a list, and I
> wonder
> > > why we
> > > > > > > would
> > > > > > > > want to send multiple endpoint config entries (one for each
> > > location
> > > > > > > > prefix).
> > > > > > > >
> > > > > > > > 2) Currently properties are defined as an unstructured bag of
> > > > > key/value
> > > > > > > > > pairs. I think Open API can be leveraged to report them as
> > > > > well-structured
> > > > > > > > objects (JSON).
> > > > > > > >
> > > > > > > > > Server code changes are required to add support for new
> properties
> > > > > anyway.
> > > > > > > > It should not be too difficult to evolve the Open API types
> at
> > > the
> > > > > > > > same time. We've done it many times already (e.g. to support
> > > non-AWS
> > > > > S3
> > > > > > > > storage).
> > > > > > > >
> > > > > > > > WDYT?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Dmitri.
> > > > > > > >
> > > > > > > > [3826]
> > > > > https://github.com/apache/polaris/pull/3826/changes#r2829649503
> > > > > > > >
> > > > > > > > On Sat, Feb 7, 2026 at 1:07 AM Jack Ye <[email protected]>
> > > wrote:
> > > > > > > >
> > > > > > > > > > How are clients supposed to know the meaning of
> credential
> > > > > > > properties in
> > > > > > > > > the "config" section?
> > > > > > > > >
> > > > > > > > > I think the contract is clear: since we are using the
> Iceberg
> > > REST
> > > > > > > model
> > > > > > > > > for StorageCredential, it must use the same configurations
> as
> > > the
> > > > > > > Iceberg
> > > > > > > > > credentials vending-related ones [1].
> > > > > > > > >
> > > > > > > > > Based on the current Polaris implementation [2], the full
> list
> > > of
> > > > > > > configs
> > > > > > > > > is probably the following:
> > > > > > > > >
> > > > > > > > > S3:
> > > > > > > > >   Credentials:
> > > > > > > > >   - s3.access-key-id - AWS access key ID
> > > > > > > > >   - s3.secret-access-key - AWS secret access key
> > > > > > > > >   - s3.session-token - Temporary STS session token
> > > > > > > > >   - s3.session-token-expires-at-ms - Token expiration
> timestamp
> > > > > (ms)
> > > > > > > > >   Extra Properties:
> > > > > > > > >   - s3.endpoint - S3 endpoint URI (optional)
> > > > > > > > >   - s3.path-style-access - Path-style access flag
> (optional)
> > > > > > > > >   - client.region - AWS region
> > > > > > > > >   - aws.refresh-credentials-endpoint - Credential refresh
> > > endpoint
> > > > > > > > > (optional)
> > > > > > > > >
> > > > > > > > > GCS:
> > > > > > > > >   Credentials:
> > > > > > > > >   - gcs.oauth2.token - Downscoped OAuth2 access token
> > > > > > > > >   - gcs.oauth2.token-expires-at - Token expiration
> timestamp
> > > (ms)
> > > > > > > > >   Extra Properties:
> > > > > > > > >   - gcs.oauth2.refresh-credentials-endpoint - Credential
> > > refresh
> > > > > > > endpoint
> > > > > > > > > (optional)
> > > > > > > > >
> > > > > > > > > Azure:
> > > > > > > > >   Credentials:
> > > > > > > > >   - adls.sas-token.<hostname> - SAS token keyed by storage
> > > account
> > > > > > > hostname
> > > > > > > > >   - adls.sas-token-expires-at-ms.<hostname> - SAS token
> > > expiration
> > > > > (ms)
> > > > > > > > >   Extra Properties:
> > > > > > > > >   - adls.refresh-credentials-endpoint - Credential refresh
> > > endpoint
> > > > > > > > > (optional)
> > > > > > > > >
> > > > > > > > > It would be helpful to specify those at least in the
> generic
> > > tables
> > > > > > > spec,
> > > > > > > > > ideally also in the Iceberg REST spec, which currently only
> > > lists
> > > > > S3
> > > > > > > > > configs and is already outdated.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Jack Ye
> > > > > > > > >
> > > > > > > > > [1]
> > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> https://github.com/apache/polaris/blob/main/spec/iceberg-rest-catalog-open-api.yaml#L3299C1-L3306C1
> > > > > > > > > [2]
> > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> https://github.com/apache/polaris/blob/main/polaris-core/src/main/java/org/apache/polaris/core/storage/StorageAccessProperty.java
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Fri, Feb 6, 2026 at 11:03 AM Eric Maynard <
> > > > > [email protected]
> > > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > > How are clients supposed to know the meaning of
> credential
> > > > > > > properties
> > > > > > > > > in
> > > > > > > > > > the "config" section?
> > > > > > > > > >
> > > > > > > > > > How are clients supposed to know the meaning of *any*
> > > properties
> > > > > > > written
> > > > > > > > > to
> > > > > > > > > > a generic table?
> > > > > > > > > >
> > > > > > > > > > That clients need to interpret the payload of a generic
> table
> > > > > > > response is
> > > > > > > > > > already intrinsic to the generic table design. True,
> adding
> > > > > > > credential
> > > > > > > > > > vending pushes generic table support towards the service
> > > being
> > > > > > > slightly
> > > > > > > > > > more opinionated about the generic table metadata (i.e.
> the
> > > > > location
> > > > > > > is
> > > > > > > > > now
> > > > > > > > > > implied to be a place that may require credentials to
> access)
> > > > > but as
> > > > > > > this
> > > > > > > > > > would be an opt-in for your generic tables I don't see
> this
> > > as a
> > > > > > > blocking
> > > > > > > > > > issue.
> > > > > > > > > >
> > > > > > > > > > --EM
> > > > > > > > > >
> > > > > > > > > > On Fri, Feb 6, 2026 at 10:57 AM Dmitri Bourlatchkov <
> > > > > > > [email protected]>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Yun,
> > > > > > > > > > >
> > > > > > > > > > > The proposal looks reasonable to me in general after a
> > > quick
> > > > > > > review of
> > > > > > > > > > the
> > > > > > > > > > > doc.
> > > > > > > > > > >
> > > > > > > > > > > I have one concern, though, which may be a whole can of
> > > worms,
> > > > > I'm
> > > > > > > > > afraid
> > > > > > > > > > > :)
> > > > > > > > > > >
> > > > > > > > > > > How are clients supposed to know the meaning of
> credential
> > > > > > > properties
> > > > > > > > > in
> > > > > > > > > > > the "config" section? The doc proposes to define it as
> a
> > > > > generic
> > > > > > > > > property
> > > > > > > > > > > bag.
> > > > > > > > > > >
> > > > > > > > > > > The example appears to use properties that Iceberg
> (java?)
> > > > > clients
> > > > > > > > > might
> > > > > > > > > > > use in a similar situation. However, the Generic Tables
> > > API is
> > > > > not
> > > > > > > > > > related
> > > > > > > > > > > to Iceberg in any way (AFAIK).
> > > > > > > > > > >
> > > > > > > > > > > Plus, I do not think these properties are well-defined
> > > even in
> > > > > > > Iceberg
> > > > > > > > > > > specifications.
> > > > > > > > > > >
> > > > > > > > > > > WDYT?
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Dmitri.
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Feb 6, 2026 at 1:29 PM yun zou <
> > > > > [email protected]
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi All,
> > > > > > > > > > > >
> > > > > > > > > > > > Generic Tables have been available since Polaris
> 1.0.0
> > > and
> > > > > have
> > > > > > > seen
> > > > > > > > > > > > growing interest from an increasing number of
> customers.
> > > > > > > > > > > >
> > > > > > > > > > > > However, the current Generic Table capability has
> some
> > > > > > > limitations.
> > > > > > > > > One
> > > > > > > > > > > > key gap is the lack of credential vending support.
> > > Without
> > > > > > > credential
> > > > > > > > > > > > vending, query engines must access tables using
> > > long-lived,
> > > > > > > static
> > > > > > > > > > cloud
> > > > > > > > > > > > credentials configured directly in the engine
> runtime,
> > > which
> > > > > > > limits
> > > > > > > > > > both
> > > > > > > > > > > > usability and security.
> > > > > > > > > > > >
> > > > > > > > > > > > To address this, we propose adding credential vending
> > > > > support for
> > > > > > > > > > Generic
> > > > > > > > > > > > Tables. This enhancement would allow a Polaris
> catalog to
> > > > > > > dynamically
> > > > > > > > > > > vend
> > > > > > > > > > > > short-lived, scoped storage credentials to query
> engines
> > > at
> > > > > > > runtime
> > > > > > > > > > when
> > > > > > > > > > > > accessing Generic Tables.
> > > > > > > > > > > >
> > > > > > > > > > > > The goals of this proposal are to:
> > > > > > > > > > > >
> > > > > > > > > > > > >    1. Enable credential vending support for Generic Tables in
> > > > > > > > > > > > >       Polaris
> > > > > > > > > > > > >    2. Deliver an end-to-end experience for currently supported
> > > > > > > > > > > > >       table formats, including Delta, Hudi, and Lance
> > > > > > > > > > > > >    3. Maintain consistency with the existing Iceberg credential
> > > > > > > > > > > > >       vending model
> > > > > > > > > > > >
> > > > > > > > > > > > Please find the attached short design document with
> > > > > additional
> > > > > > > > > details.
> > > > > > > > > > > We
> > > > > > > > > > > > would appreciate your review and valuable feedback.
> > > > > > > > > > > >
> > > > > > > > > > > > link to google doc:
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
> https://docs.google.com/document/d/1QzTx4tcS23_mF-gc77GbTqtwuRHY5f_Aa_6E4VSKFU4/edit?tab=t.0#heading=h.rpqtaz73xt4v
> > > > > > > > > > > >
> > > > > > > > > > > > Best Regards,
> > > > > > > > > > > >
> > > > > > > > > > > > Yun
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > >
> > >
>
