I’d suggest we start from concrete use cases. If the inline model (Option 2) works well for the primary scenarios, e.g., relatively sparse table-level storage overrides, we could adopt it as a first phase. It keeps the implementation simple and lets us validate real-world needs before introducing additional abstractions.
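To make the inline model concrete, here is a minimal sketch of the fallback behavior described for Option 2: an entity with no storage config of its own uses the nearest ancestor's. The `Entity` class, the `storage_config` field, and the role ARNs are hypothetical illustrations and not existing Polaris code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entity:
    """Hypothetical stand-in for a catalog, namespace, or table entity."""
    name: str
    parent: Optional["Entity"] = None
    # None means "no override here, inherit from the parent" (assumed semantics).
    storage_config: Optional[dict] = None


def effective_storage_config(entity: Entity) -> Optional[dict]:
    """Walk up the hierarchy and return the nearest non-null storage config."""
    current: Optional[Entity] = entity
    while current is not None:
        if current.storage_config is not None:
            return current.storage_config
        current = current.parent
    return None


# Illustrative hierarchy: catalog -> namespace -> tables (all values made up).
catalog = Entity("catalog", storage_config={"roleArn": "arn:aws:iam::111111111111:role/catalog-default"})
namespace = Entity("finance", parent=catalog)
plain_table = Entity("orders", parent=namespace)
override_table = Entity(
    "audit",
    parent=namespace,
    storage_config={"roleArn": "arn:aws:iam::222222222222:role/audit-only"},
)
```

In this sketch `plain_table` resolves to the catalog's config while `override_table` keeps its own, which is the sparse-override case this option seems best suited to.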
However, if we anticipate frequent configuration rotation or strong reuse requirements across many tables, Option 1 is more compelling. In that case, I'd recommend reusing the existing policy framework where possible, since it already provides inheritance and attachment semantics. That could help us avoid introducing significant new complexity into Polaris while still supporting the richer model.

Yufei

On Wed, Feb 11, 2026 at 9:12 AM Dmitri Bourlatchkov <[email protected]> wrote:

> Hi Srinivas,
>
> Thanks for the discussion recap! It's very useful to keep the dev thread and meetings aligned.
>
> Option 1:
> Credential Rotation: Highly efficient. Because the configuration is referenced by ID, rotating a cloud IAM role or secret requires updating only the single StorageConfiguration entity. [...]
>
> This seems to imply that credentials are stored as part of the Storage Configuration Entity. If so, I do not think this approach is ideal. I believe the secret data should ideally be accessed via the Secrets Manager [1]. While that discussion is still in progress, I believe it interconnects with this proposal.
>
> [...] All thousands of downstream tables referencing it would immediately use the new credentials without metadata updates.
>
> Immediacy is probably from the end-user's perspective. Internally, different Polaris processes may switch to the updated config at different moments in time... I do not think it is a problem in this case, just wanted to highlight it to make sure distributed system aspects are not left out :)
>
> Option 2:
> Credential Rotation: Credential rotation is difficult [...]
>
> Again, I believe actual credentials should be accessed via the Secrets Manager [1] so some indirection will be present.
>
> Config updates will need to happen individually in each case, but actual secrets could be shared and updated centrally via the Secrets Manager.
>
> ATM, given the complexity points about option 1 that were brought up in the community sync, I tend to favour this option for implementing this proposal. However, this is not a strong requirement by any means, just my personal opinion. Other opinions are welcome.
>
> Depending on how secret references are handled in code (needs a POC, I guess), there could be some synergy with Tornike's approach from [3699].
>
> Option 3: Named Catalog-Level Configurations (Hybrid) [...]
>
> I would like to clarify the UX story in this case. Do we expect end users to manage Storage Configuration in this case or the Polaris owner?
>
> In the latter case, it seems similar to Tornike's proposal in [3699] but generalized to all storage types. The Polaris Admin / Owner could use a non-public API to work with this configuration (e.g. plain Quarkus configuration or possibly Admin CLI).
>
> Option 4: Leverage Existing Policy Framework [...]
>
> I tend to agree with the "semantic confusion" point.
>
> It should be fine to reuse policy-related code in the implementation (if possible), but I believe Storage Configuration and related credential management form a distinct use case / feature and deserve dedicated handling in Polaris and the API / UX level.
>
> [1] https://lists.apache.org/thread/68r3gcx70f0qhbtz3w4zhb8f9s4vvw1f
>
> [3699] https://github.com/apache/polaris/pull/3699
>
> Thanks,
> Dmitri.
>
> On Tue, Feb 10, 2026 at 10:19 PM Srinivas Rishindra <[email protected]> wrote:
>
> > Hi Everyone,
> >
> > We had an opportunity to discuss this feature and my recent proposal at the last community sync meeting. I would like to summarize our discussion and enumerate the various options we considered to help us reach a consensus.
> >
> > To recap, storage configuration is currently restricted at the catalog level.
> > This limits flexibility for users who need to organize tables across different storage configurations or cloud providers within a single catalog. There appears to be general agreement on the utility of this feature; however, we still need to align on the specific implementation approach.
> >
> > Here are the various options that were considered.
> >
> > *Option 0: Make Credentials available as part of table properties.* (This was my original proposal, but abandoned after becoming aware of the security implications.)
> >
> > *Option 1: First-Class Storage Configuration Entity*
> >
> > This approach proposes elevating StorageConfiguration to a standalone, top-level resource in the Polaris backend (similar to a Principal, Namespace or Table), independent of the Catalog or Table. This is the approach in my most recent proposal doc.
> >
> > - Data Model: A new StorageConfiguration entity is created with its own unique identifier and lifecycle. Tables and Namespaces would store a reference ID pointing to this entity rather than embedding the credentials directly.
> >
> > - Security: This model offers the cleanest security boundary. We can introduce a specific USAGE privilege on the configuration entity. A user would need both CREATE_TABLE on the Namespace *and* USAGE on the specific StorageConfiguration to link them.
> >
> > - Credential Rotation: Highly efficient. Because the configuration is referenced by ID, rotating a cloud IAM role or secret requires updating only the single StorageConfiguration entity. All thousands of downstream tables referencing it would immediately use the new credentials without metadata updates.
> >
> > - Inheritance: The reference could be set at the Catalog, Namespace, or Table level.
> >   If a Table does not specify a reference, it would inherit the reference from its parent Namespace (and so on), preserving the current hierarchical behavior while adding granularity.
> >
> > • Pros: Maximum flexibility and reusability (Many-to-Many). Updating one config object propagates to all associated tables.
> >
> > • Cons: Highest engineering cost. Requires new CRUD APIs, DB schema changes (mapping tables), and complex authorization logic (two-stage auth checks). Risk of accumulating "orphaned" configs
> >
> > Option 2: The "Embedded Field" Model
> >
> > This approach extends the existing Table and Namespace entities to include a storageConfig field. The parameter can be defaulted to 'null' and use parent's storageConfig at runtime.
> >
> > *Data Model:* No new top-level entity is created. The storage details (e.g., roleArn) are stored directly into a new, dedicated column or structure within the existing Table/Namespace entity.
> >
> > Complexity: This could reduce the engineering overhead significantly. There are no new CRUD endpoints for configuration objects, no referential integrity checks (e.g., preventing the deletion of a config used by active tables).
> >
> > Credential Rotation: Credential rotation is difficult. If an IAM role changes, an administrator must identify and issue UPDATE operations for every individual table or namespace that uses that specific configuration, potentially affecting thousands of objects.
> >
> > • Pros: Lowest engineering cost. No new entities or complex mappings are required. Easy to reason about authorization (auth is tied strictly to the entity).
> >
> > • Cons: No reusability. Configs must be duplicated across tables; rotating credentials for 1,000 tables could require 1,000 update calls.
> >
> > Option 3: Named Catalog-Level Configurations (Hybrid)
> >
> > This can be a combination of Option1 and Option 2
> > Admin can define a registry of "Named Storage Configurations" stored within the Catalog. Sub-entities (Namespaces/Tables) reference these configs by name (e.g., storage-config: "finance-secure-role").
> >
> > *Data Model:* No separate top level entity is created. The Catalog Entity potentially needs to be modified to accommodate named storage configurations.
> >
> > Credential Rotation: Credential Rotation can be done at the catalog level for each named Storage Configuration.
> >
> > Inheritance: Works pretty much similar as proposed in option 1 & option2.
> >
> > Security: Not as secure as option1 but still useful. A principal with proper access can attach any named storage configuration defined at the catalog level to any arbitrary entity within the catalog.
> >
> > • Pros: Good balance of reusability and simplicity. Allows updating a config in one place (the Catalog definition) without needing a full-blown global entity system.
> >
> > • Cons: Scope is limited to the Catalog (cannot share configs across catalogs)
> >
> > Option 4: Leverage Existing Policy Framework
> >
> > This approach leverages the existing Apache Polaris Policy Framework (currently used for features like snapshot expiry) to manage storage settings.
> >
> > Data Model: Storage configurations are defined as "Policies" at the Catalog level. These Policies contain the credential details and can be attached to Namespaces or Tables using the existing policy attachment APIs.
> >
> > Inheritance: This aligns naturally with Polaris's existing architecture, where policies cascade from Catalog → Namespace → Table. The vending logic would simply resolve the "effective" storage policy for a table at query time.
> >
> > Security: This utilizes the existing Polaris Privileges and attachment privileges.
> > Administrators can define authorized storage policies centrally, and users can only select from these pre-approved policies, preventing them from inputting arbitrary or insecure role ARNs.
> >
> > • Pros:
> >   . Zero New Infrastructure: Reuses the existing "Policy" entity, persistence layer, and inheritance logic, significantly reducing engineering effort
> >   . Proven Inheritance: The logic for resolving policies from child to parent is already implemented and tested
> >
> > • Cons:
> >   . Semantic Confusion: Policies are typically used for "governance rules" (e.g., snapshot expiry, compaction) rather than "connectivity configuration." Using them for credentials might be unintuitive
> >   . Authorization Complexity: The authorizer would need to load and evaluate policies to determine how to access data, potentially coupling governance logic with data access paths
> >
> > We can potentially start with one of the options initially and as the feature and user needs develop we can migrate to other options as well.
> > Please let me know your thoughts about the various options above or if on anything that I might have missed so that we can work towards a consensus on how to implement this feature.
> >
> > On Thu, Feb 5, 2026 at 8:08 AM Tornike Gurgenidze <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > To follow up on Dmitri's point about credentials, there's already a PR <https://github.com/apache/polaris/pull/3409> up that is going to allow predefining named storage credentials in polaris config like the following:
> > >
> > > - polaris.storage.aws.<storage-name>.access-key
> > > - polaris.storage.aws.<storage-name>.secret-key
> > >
> > > then storage configuration will simply refer to it by name and inherit credentials.
> > >
> > > I think that can go hand in hand with table-level overrides. Overriding each and every aws property for every table doesn't sound ideal.
> > > Defining a storage configuration upfront and referring to it by name should be a simpler solution. I can extend the scope of the PR above to allow predefining other aws properties as well like endpoint-url and region.
> > >
> > > Another point that came up in the discussion surrounding extra credentials is how to make sure anyone can't just hijack pre configured credentials. The simplest solution I see there is to ship off properties to OPA during catalog (and table) creation and allow users to write policies based on them. If we want to enable internal rbac to have a similar capability we can go further and move from config based storage definition to a separate `/storage-config` rest resource in management API that will come with necessary grants and permissions.
> > >
> > > On Thu, Feb 5, 2026 at 5:43 AM Dmitri Bourlatchkov <[email protected]> wrote:
> > >
> > > > Hi Srinivas,
> > > >
> > > > Thanks for the proposal. It looks good to me overall, a very timely feature to add to Polaris.
> > > >
> > > > I added some comments in the doc and I see this topic on the Community Sync agenda for Feb 5. Looking forward to discussing it online.
> > > >
> > > > I have three points to highlight:
> > > >
> > > > * Dealing with passwords probably connects to the Secrets Manager discussion [1]
> > > >
> > > > * Persistence needs to consider non-RDBMS backends. OSS code has both PostgreSQL and MongoDB, but private Persistence implementations are possible too. I believe we need a proper SPI for this, not just a relational schema example.
> > > >
> > > > * Associating entities (tables, namespaces) to Storage Configuration is likely a plugin point that downstream projects may want to customize. I'd propose making another SPI for this.
> > > >   This SPI is probably different from the new Persistence SPI mentioned above since the concern here is not persistence per se, but the logic of finding the right storage config.
> > > >
> > > > [1] https://lists.apache.org/thread/68r3gcx70f0qhbtz3w4zhb8f9s4vvw1f
> > > >
> > > > Cheers,
> > > > Dmitri.
> > > >
> > > > On Mon, Feb 2, 2026 at 4:18 PM Srinivas Rishindra <[email protected]> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > We had an opportunity to discuss the community sprint last week. Based on that discussion, I have created a new design doc which I am attaching here. In this design instead of passing credentials via table properties, this design introduces Inheritable Storage Configurations as a first-class feature. Please let me know your thoughts on the document.
> > > > >
> > > > > https://docs.google.com/document/d/1hbDkE-w84Pn_112iW2vCnlDKPDtyg8flaYcFGjvD120/edit?usp=sharing
> > > > >
> > > > > On Mon, Jan 26, 2026 at 10:42 PM Yufei Gu <[email protected]> wrote:
> > > > >
> > > > > > Hi Srinivas,
> > > > > >
> > > > > > Thanks for sharing this proposal. Persisting long lived credentials such as an S3 secret access key directly in table properties raises significant security concerns. Here is an alternative approach previously discussed, which enables storage configuration at the table or namespace level, and it is probably a more secure and promising direction overall.
> > > > > >
> > > > > > Yufei
> > > > > >
> > > > > > On Mon, Jan 26, 2026 at 8:18 PM Srinivas Rishindra <[email protected]> wrote:
> > > > > >
> > > > > > > Dear All,
> > > > > > >
> > > > > > > I have developed a design proposal for Table-Level Storage Credential Overrides in Apache Polaris.
> > > > > > >
> > > > > > > The core objective is to allow specific storage properties to be defined at the table level rather than the catalog level, enabling a single logical catalog to support tables across disparate storage systems. Crucially, the implementation ensures these overrides participate in the credential vending process to maintain secure, scoped access.
> > > > > > >
> > > > > > > I have also implemented a Proof of Concept (POC) pull request to demonstrate the idea. While the current MVP focuses on S3, I intend to expand scope to include Azure and GCS pending community feedback.
> > > > > > >
> > > > > > > I look forward to your thoughts and suggestions on this proposal.
> > > > > > >
> > > > > > > Links:
> > > > > > >
> > > > > > > - Design Doc: Table-Level Storage Credential Overrides (https://docs.google.com/document/d/1tf4N8GKeyAAYNoP0FQ1zT1Ba3P1nVGgdw3nmnhSm-u0/edit?usp=sharing)
> > > > > > > - POC PR: https://github.com/apache/polaris/pull/3563
> > > > > > >
> > > > > > > Best regards,
> > > > > > >
> > > > > > > Srinivas Rishindra Pothireddi
