I agree with YuFei. Until we identify more concrete use cases, the *inline
model* seems to be the best starting point. It is particularly well-suited
for sparse configurations, where only a few tables in a namespace require
overrides while the rest remain unchanged.
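
To make the fallback behavior concrete, here is a minimal sketch of how the
inline model could resolve a table's effective storage config. All names
here (InlineStorageConfigSketch, StorageConfig, setOverride, resolve) are
hypothetical and for illustration only; they are not existing Polaris APIs.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: sparse inline overrides with fallback to the parent.
final class InlineStorageConfigSketch {

  // Simplified stand-in for a storage configuration (not a Polaris type).
  record StorageConfig(String roleArn) {}

  // Only entities with an explicit override appear here. Keys are entity
  // paths such as ["cat"], ["cat", "ns"] or ["cat", "ns", "tbl"].
  private final Map<List<String>, StorageConfig> overrides = new HashMap<>();

  void setOverride(List<String> path, StorageConfig config) {
    overrides.put(List.copyOf(path), config);
  }

  // Walk from the entity up towards the catalog; the nearest override wins.
  Optional<StorageConfig> resolve(List<String> path) {
    for (int end = path.size(); end >= 1; end--) {
      StorageConfig config = overrides.get(path.subList(0, end));
      if (config != null) {
        return Optional.of(config);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    InlineStorageConfigSketch sketch = new InlineStorageConfigSketch();
    sketch.setOverride(List.of("cat"), new StorageConfig("arn:catalog-role"));
    sketch.setOverride(List.of("cat", "ns", "t2"), new StorageConfig("arn:t2-role"));

    // t1 has no override and falls back to the catalog; t2 uses its own.
    System.out.println(sketch.resolve(List.of("cat", "ns", "t1")));
    System.out.println(sketch.resolve(List.of("cat", "ns", "t2")));
  }
}
```

In the sparse case only the handful of overridden tables carry any extra
state, which is why this model looks like a reasonable first phase.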

*Next Steps:* Unless there are any objections, I will update the design doc
to reflect this approach. Once approved, I will proceed with implementation.

On Wed, Feb 11, 2026 at 3:49 PM Yufei Gu <[email protected]> wrote:

> I’d suggest we start from concrete use cases.
>
> If the inline model (Option 2) works well for the primary scenarios, e.g.,
> relatively sparse table-level storage overrides, we could adopt it as a
> first phase. It keeps the implementation simple and lets us validate
> real-world needs before introducing additional abstractions.
>
> However, if we anticipate frequent configuration rotation or strong reuse
> requirements across many tables, Option 1 is more compelling. In that case,
> I'd recommend reusing the existing policy framework where possible, since
> it already provides inheritance and attachment semantics. That could help
> us avoid introducing significant new complexity into Polaris while still
> supporting the richer model.
> Yufei
>
>
> On Wed, Feb 11, 2026 at 9:12 AM Dmitri Bourlatchkov <[email protected]>
> wrote:
>
> > Hi Srinivas,
> >
> > Thanks for the discussion recap! It's very useful to keep the dev thread
> > and meetings aligned.
> >
> > Option 1:
> > Credential Rotation: Highly efficient. Because the configuration is
> > referenced by ID, rotating a cloud IAM role or secret requires updating
> > only the single StorageConfiguration entity. [...]
> >
> >
> > This seems to imply that credentials are stored as part of the Storage
> > Configuration Entity. If so, I do not think this approach is ideal. I
> > believe the secret data should ideally be accessed via the Secrets
> > Manager [1]. While that discussion is still in progress, I believe it
> > interconnects with this proposal.
> >
> > [...] All thousands of downstream
> > tables referencing it would immediately use the new credentials without
> > metadata updates.
> >
> >
> > Immediacy is probably from the end-user's perspective. Internally,
> > different Polaris processes may switch to the updated config at
> > different moments in time... I do not think it is a problem in this case,
> > just wanted to highlight it to make sure distributed system aspects are
> > not left out :)
> >
> > Option 2:
> > Credential Rotation: Credential rotation is difficult [...]
> >
> >
> > Again, I believe actual credentials should be accessed via the Secrets
> > Manager [1] so some indirection will be present.
> >
> > Config updates will need to happen individually in each case, but actual
> > secrets could be shared and updated centrally via the Secrets Manager.
> >
> > ATM, given the complexity points about option 1 that were brought up in
> > the community sync, I tend to favour this option for implementing this
> > proposal. However, this is not a strong requirement by any means, just my
> > personal opinion. Other opinions are welcome.
> >
> > Depending on how secret references are handled in code (needs a POC, I
> > guess), there could be some synergy with Tornike's approach from [3699].
> >
> > Option 3: Named Catalog-Level Configurations (Hybrid) [...]
> >
> >
> > I would like to clarify the UX story in this case. Do we expect end users
> > to manage Storage Configuration in this case or the Polaris owner?
> >
> > In the latter case, it seems similar to Tornike's proposal in [3699] but
> > generalized to all storage types. The Polaris Admin / Owner could use a
> > non-public API to work with this configuration (e.g. plain Quarkus
> > configuration or possibly Admin CLI).
> >
> > Option 4: Leverage Existing Policy Framework [...]
> >
> >
> > I tend to agree with the "semantic confusion" point.
> >
> > It should be fine to reuse policy-related code in the implementation (if
> > possible), but I believe Storage Configuration and related credential
> > management form a distinct use case / feature and deserve dedicated
> > handling in Polaris and the API / UX level.
> >
> > [1] https://lists.apache.org/thread/68r3gcx70f0qhbtz3w4zhb8f9s4vvw1f
> >
> > [3699] https://github.com/apache/polaris/pull/3699
> >
> > Thanks,
> > Dmitri.
> >
> > On Tue, Feb 10, 2026 at 10:19 PM Srinivas Rishindra <
> > [email protected]>
> > wrote:
> >
> > > Hi Everyone,
> > >
> > > We had an opportunity to discuss this feature and my recent proposal at
> > > the last community sync meeting. I would like to summarize our
> > > discussion and enumerate the various options we considered to help us
> > > reach a consensus.
> > >
> > > To recap, storage configuration is currently restricted to the catalog
> > > level. This limits flexibility for users who need to organize tables
> > > across different storage configurations or cloud providers within a
> > > single catalog. There appears to be general agreement on the utility of
> > > this feature; however, we still need to align on the specific
> > > implementation approach.
> > >
> > > Here are the various options that were considered.
> > >
> > > *Option 0: Make Credentials available as part of table properties.*
> > > (This was my original proposal, but I abandoned it after becoming aware
> > > of the security implications.)
> > >
> > > *Option 1: First-Class Storage Configuration Entity*
> > >
> > > This approach proposes elevating StorageConfiguration to a standalone,
> > > top-level resource in the Polaris backend (similar to a Principal,
> > > Namespace or Table), independent of the Catalog or Table. This is the
> > > approach in my most recent proposal doc.
> > >
> > > - Data Model: A new StorageConfiguration entity is created with its own
> > >   unique identifier and lifecycle. Tables and Namespaces would store a
> > >   reference ID pointing to this entity rather than embedding the
> > >   credentials directly.
> > >
> > > - Security: This model offers the cleanest security boundary. We can
> > >   introduce a specific USAGE privilege on the configuration entity. A
> > >   user would need both CREATE_TABLE on the Namespace *and* USAGE on the
> > >   specific StorageConfiguration to link them.
> > >
> > > - Credential Rotation: Highly efficient. Because the configuration is
> > >   referenced by ID, rotating a cloud IAM role or secret requires updating
> > >   only the single StorageConfiguration entity. All thousands of
> > >   downstream tables referencing it would immediately use the new
> > >   credentials without metadata updates.
> > >
> > > - Inheritance: The reference could be set at the Catalog, Namespace, or
> > >   Table level. If a Table does not specify a reference, it would inherit
> > >   the reference from its parent Namespace (and so on), preserving the
> > >   current hierarchical behavior while adding granularity.
> > >
> > > • Pros: Maximum flexibility and reusability (Many-to-Many). Updating one
> > > config object propagates to all associated tables.
> > >
> > > • Cons: Highest engineering cost. Requires new CRUD APIs, DB schema
> > > changes (mapping tables), and complex authorization logic (two-stage auth
> > > checks). Risk of accumulating "orphaned" configs.
> > >
> > > Option 2: The "Embedded Field" Model
> > >
> > > This approach extends the existing Table and Namespace entities to
> > > include a storageConfig field. The field can default to null, in which
> > > case the parent's storageConfig is used at runtime.
> > >
> > > *Data Model:* No new top-level entity is created. The storage details
> > > (e.g., roleArn) are stored directly in a new, dedicated column or
> > > structure within the existing Table/Namespace entity.
> > >
> > > Complexity: This could reduce the engineering overhead significantly.
> > > There are no new CRUD endpoints for configuration objects and no
> > > referential integrity checks (e.g., preventing the deletion of a config
> > > used by active tables).
> > >
> > > Credential Rotation: Credential rotation is difficult. If an IAM role
> > > changes, an administrator must identify and issue UPDATE operations for
> > > every individual table or namespace that uses that specific
> > > configuration, potentially affecting thousands of objects.
> > >
> > > • Pros: Lowest engineering cost. No new entities or complex mappings are
> > > required. Easy to reason about authorization (auth is tied strictly to
> > > the entity).
> > >
> > > • Cons: No reusability. Configs must be duplicated across tables;
> > > rotating credentials for 1,000 tables could require 1,000 update calls.
> > >
> > > Option 3: Named Catalog-Level Configurations (Hybrid)
> > >
> > > This can be a combination of Option 1 and Option 2. An admin can define
> > > a registry of "Named Storage Configurations" stored within the Catalog.
> > > Sub-entities (Namespaces/Tables) reference these configs by name (e.g.,
> > > storage-config: "finance-secure-role").
> > >
> > > *Data Model:* No separate top-level entity is created. The Catalog
> > > Entity potentially needs to be modified to accommodate named storage
> > > configurations.
> > >
> > > Credential Rotation: Credential rotation can be done at the catalog
> > > level for each named Storage Configuration.
> > >
> > > Inheritance: Works much the same as proposed in Option 1 and Option 2.
> > >
> > > Security: Not as secure as Option 1 but still useful. A principal with
> > > proper access can attach any named storage configuration defined at the
> > > catalog level to any arbitrary entity within the catalog.
> > >
> > > • Pros: Good balance of reusability and simplicity. Allows updating a
> > > config in one place (the Catalog definition) without needing a
> > > full-blown global entity system.
> > >
> > > • Cons: Scope is limited to the Catalog (cannot share configs across
> > > catalogs).
> > >
> > > Option 4: Leverage Existing Policy Framework
> > >
> > > This approach leverages the existing Apache Polaris Policy Framework
> > > (currently used for features like snapshot expiry) to manage storage
> > > settings.
> > >
> > > Data Model: Storage configurations are defined as "Policies" at the
> > > Catalog level. These Policies contain the credential details and can be
> > > attached to Namespaces or Tables using the existing policy attachment
> > > APIs.
> > >
> > > Inheritance: This aligns naturally with Polaris's existing architecture,
> > > where policies cascade from Catalog → Namespace → Table. The vending
> > > logic would simply resolve the "effective" storage policy for a table at
> > > query time.
> > >
> > > Security: This utilizes the existing Polaris privileges and attachment
> > > privileges. Administrators can define authorized storage policies
> > > centrally, and users can only select from these pre-approved policies,
> > > preventing them from inputting arbitrary or insecure role ARNs.
> > >
> > > • Pros:
> > >   . Zero New Infrastructure: Reuses the existing "Policy" entity,
> > >     persistence layer, and inheritance logic, significantly reducing
> > >     engineering effort.
> > >   . Proven Inheritance: The logic for resolving policies from child to
> > >     parent is already implemented and tested.
> > >
> > > • Cons:
> > >   . Semantic Confusion: Policies are typically used for "governance
> > >     rules" (e.g., snapshot expiry, compaction) rather than "connectivity
> > >     configuration." Using them for credentials might be unintuitive.
> > >   . Authorization Complexity: The authorizer would need to load and
> > >     evaluate policies to determine how to access data, potentially
> > >     coupling governance logic with data access paths.
> > >
> > > We can start with one of these options and, as the feature and user
> > > needs develop, migrate to another option later. Please let me know your
> > > thoughts on the options above, or on anything I might have missed, so
> > > that we can work towards a consensus on how to implement this feature.
> > >
> > >
> > > On Thu, Feb 5, 2026 at 8:08 AM Tornike Gurgenidze <[email protected]>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > To follow up on Dmitri's point about credentials, there's already a PR
> > > > <https://github.com/apache/polaris/pull/3409> up that is going to allow
> > > > predefining named storage credentials in Polaris config like the
> > > > following:
> > > >
> > > >    - polaris.storage.aws.<storage-name>.access-key
> > > >    - polaris.storage.aws.<storage-name>.secret-key
> > > >
> > > > Then the storage configuration will simply refer to it by name and
> > > > inherit credentials.
> > > >
> > > > I think that can go hand in hand with table-level overrides. Overriding
> > > > each and every AWS property for every table doesn't sound ideal.
> > > > Defining a storage configuration upfront and referring to it by name
> > > > should be a simpler solution. I can extend the scope of the PR above to
> > > > allow predefining other AWS properties as well, like endpoint-url and
> > > > region.
> > > >
> > > > Another point that came up in the discussion surrounding extra
> > > > credentials is how to make sure no one can simply hijack preconfigured
> > > > credentials. The simplest solution I see there is to ship off
> > > > properties to OPA during catalog (and table) creation and allow users
> > > > to write policies based on them. If we want to enable internal RBAC to
> > > > have a similar capability, we can go further and move from config-based
> > > > storage definition to a separate `/storage-config` REST resource in the
> > > > management API that will come with the necessary grants and
> > > > permissions.
> > > >
> > > > On Thu, Feb 5, 2026 at 5:43 AM Dmitri Bourlatchkov <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi Srinivas,
> > > > >
> > > > > Thanks for the proposal. It looks good to me overall, a very timely
> > > > > feature to add to Polaris.
> > > > >
> > > > > I added some comments in the doc and I see this topic on the
> > > > > Community Sync agenda for Feb 5. Looking forward to discussing it
> > > > > online.
> > > > >
> > > > > I have three points to highlight:
> > > > >
> > > > > * Dealing with passwords probably connects to the Secrets Manager
> > > > > discussion [1]
> > > > >
> > > > > * Persistence needs to consider non-RDBMS backends. OSS code has both
> > > > > PostgreSQL and MongoDB, but private Persistence implementations are
> > > > > possible too. I believe we need a proper SPI for this, not just a
> > > > > relational schema example.
> > > > >
> > > > > * Associating entities (tables, namespaces) with a Storage
> > > > > Configuration is likely a plugin point that downstream projects may
> > > > > want to customize. I'd propose making another SPI for this. This SPI
> > > > > is probably different from the new Persistence SPI mentioned above
> > > > > since the concern here is not persistence per se, but the logic of
> > > > > finding the right storage config.
> > > > >
> > > > > [1] https://lists.apache.org/thread/68r3gcx70f0qhbtz3w4zhb8f9s4vvw1f
> > > > >
> > > > > Cheers,
> > > > > Dmitri.
> > > > >
> > > > > On Mon, Feb 2, 2026 at 4:18 PM Srinivas Rishindra <
> > > > > [email protected]> wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > We had an opportunity to discuss this at the community sync last
> > > > > > week. Based on that discussion, I have created a new design doc,
> > > > > > which I am attaching here. Instead of passing credentials via table
> > > > > > properties, this design introduces Inheritable Storage
> > > > > > Configurations as a first-class feature. Please let me know your
> > > > > > thoughts on the document.
> > > > > >
> > > > > > https://docs.google.com/document/d/1hbDkE-w84Pn_112iW2vCnlDKPDtyg8flaYcFGjvD120/edit?usp=sharing
> > > > > >
> > > > > >
> > > > > > On Mon, Jan 26, 2026 at 10:42 PM Yufei Gu <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Srinivas,
> > > > > > >
> > > > > > > Thanks for sharing this proposal. Persisting long-lived
> > > > > > > credentials such as an S3 secret access key directly in table
> > > > > > > properties raises significant security concerns. Here is an
> > > > > > > alternative approach previously discussed, which enables storage
> > > > > > > configuration at the table or namespace level, and it is probably
> > > > > > > a more secure and promising direction overall.
> > > > > > >
> > > > > > > Yufei
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Jan 26, 2026 at 8:18 PM Srinivas Rishindra <
> > > > > > > [email protected]> wrote:
> > > > > > >
> > > > > > > > Dear All,
> > > > > > > >
> > > > > > > > I have developed a design proposal for Table-Level Storage
> > > > > > > > Credential Overrides in Apache Polaris.
> > > > > > > >
> > > > > > > > The core objective is to allow specific storage properties to be
> > > > > > > > defined at the table level rather than the catalog level,
> > > > > > > > enabling a single logical catalog to support tables across
> > > > > > > > disparate storage systems. Crucially, the implementation ensures
> > > > > > > > these overrides participate in the credential vending process to
> > > > > > > > maintain secure, scoped access.
> > > > > > > >
> > > > > > > > I have also implemented a Proof of Concept (POC) pull request to
> > > > > > > > demonstrate the idea. While the current MVP focuses on S3, I
> > > > > > > > intend to expand the scope to include Azure and GCS pending
> > > > > > > > community feedback.
> > > > > > > >
> > > > > > > > I look forward to your thoughts and suggestions on this
> > > > > > > > proposal.
> > > > > > > >
> > > > > > > > Links:
> > > > > > > >
> > > > > > > > - Design Doc: Table-Level Storage Credential Overrides (
> > > > > > > > https://docs.google.com/document/d/1tf4N8GKeyAAYNoP0FQ1zT1Ba3P1nVGgdw3nmnhSm-u0/edit?usp=sharing
> > > > > > > > )
> > > > > > > > - POC PR: https://github.com/apache/polaris/pull/3563
> > > > > > > >
> > > > > > > > Best regards,
> > > > > > > >
> > > > > > > > Srinivas Rishindra Pothireddi
