Thanks for driving this effort, Varun. I think the new change introducing StorageLocationPreparer is an interesting idea. It would allow Polaris to check whether storage is ready before table creation, a capability Polaris lacks today. I think it will work well with managed storage solutions that require some setup before data is written to the location.

Cheers,
Sung

On 2026/04/20 18:51:32 Dmitri Bourlatchkov wrote:
> Hi All,
>
> Heads up: we discussed this PR in the Community Sync last week. Please
> review the actual changes in GH and provide feedback.
>
> Thanks,
> Dmitri.
>
> On Thu, Apr 2, 2026 at 1:46 PM Dmitri Bourlatchkov <[email protected]> wrote:
>
> > Hi Varun,
> >
> > Thanks for your contribution! I think it's quite timely as we have some
> > recent user interest in the community [4106].
> >
> > Your design LGTM overall. I'll try to have a full review of the PR soon.
> >
> > Note for other reviewers: This feature involves a minor Polaris REST API
> > change.
> >
> > Should the flag default to auto-detection in the future (e.g.,
> > checking if the bucket has HNS enabled at catalog creation time), or is
> > an explicit opt-in flag the right long-term approach?
> >
> >
> > Auto-detection is a nice-to-have feature, but I'm not sure it's worth the
> > added complexity on the Polaris side.
> >
> > I believe (admin) users are generally aware of the HNS state of the bucket
> > they use for a catalog, so having an HNS flag in the Storage Configuration
> > is probably not burdensome. At least this is quite acceptable for the
> > initial PR. Note that Azure has a similar flag in its Storage Configuration
> > [3347].
> >
> > This is just my personal opinion. If you feel like adding auto-detection
> > in a follow-up PR, by all means please feel free to contribute that too.
> >
> > [3347] https://github.com/apache/polaris/pull/3347
> >
> > [4106] https://github.com/apache/polaris/pull/4106
> >
> > Cheers,
> > Dmitri.
> >
> > On Thu, Apr 2, 2026 at 2:27 AM Varun Arya <[email protected]> wrote:
> >
> >> Hi Polaris community,
> >> I'd like to start a discussion around a change I've been working on to add
> >> support for GCS buckets with Hierarchical Namespace (HNS)
> >> <https://cloud.google.com/storage/docs/hns-overview> enabled.
> >>
> >> *Problem*
> >> When a GCS bucket has HNS enabled, folders must be created explicitly as
> >> managed folders. The current credential vending logic only grants
> >> `roles/storage.legacyObjectReader`,
> >> `roles/storage.objectViewer`, and `roles/storage.legacyBucketWriter`, which
> >> is insufficient for HNS buckets. Operations that need to create folders
> >> (e.g., Iceberg writing data/metadata) fail with 403 errors because the
> >> downscoped token lacks the `storage.managedFolders.create` permission.
> >>
> >> *Proposed Solution*
> >> I've added a new optional hierarchicalNamespace boolean flag to
> >> GcpStorageConfigInfo in the management API. When set to true, the
> >> credential vending logic generates an additional access boundary rule
> >> granting `roles/storage.folderAdmin` scoped to the specific write paths
> >> using
> >> resource.name.startsWith('projects/_/buckets/<bucket>/managedFolders/<path>')
> >> conditions.
> >>
> >> *Key design decisions*:
> >>
> >>    1. Opt-in flag: HNS support is not enabled by default. Users must
> >>    explicitly set hierarchicalNamespace: true in their catalog storage
> >>    configuration. This avoids granting unnecessary permissions on non-HNS
> >>    buckets.
> >>    2. Least-privilege scoping: `roles/storage.folderAdmin` is the
> >>    least-privileged predefined GCP role that includes
> >>    storage.managedFolders.create. The access boundary condition expression
> >>    limits scope to the specific write paths only.
> >>    3. Write-path only: Folder management rules are only generated for
> >>    write locations. Read-only access does not get folderAdmin permissions.
> >>    4. Multi-bucket support: The implementation handles cases where
> >>    metadata and data reside in separate buckets, generating correctly
> >>    scoped folderAdmin rules for each.
> >>
> >> *Open Questions*
> >>
> >>    1. Should the flag default to auto-detection in the future (e.g.,
> >>    checking if the bucket has HNS enabled at catalog creation time), or
> >>    is an explicit opt-in flag the right long-term approach?
> >>
> >> PR link: Enable HNS support for GCS
> >> <https://github.com/apache/polaris/pull/3996>
> >>
> >> Thanks,
> >> Varun Arya
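
For readers following the thread, the rule generation described in the proposal can be sketched roughly as below. This is an illustrative Python sketch, not the code in the PR (Polaris itself is written in Java). The function name `folder_admin_rules` and its parameters are hypothetical, and the sketch assumes write locations are given as `gs://<bucket>/<path>` URIs. The rule layout follows the GCP Credential Access Boundary JSON shape (`availableResource`, `availablePermissions`, `availabilityCondition`), and the condition expression is the one quoted in the proposal.

```python
from urllib.parse import urlparse


def folder_admin_rules(write_locations, hierarchical_namespace):
    """Build folderAdmin access boundary rules for write paths (hypothetical helper)."""
    # Opt-in: without the flag, no folder-management permissions are added.
    if not hierarchical_namespace:
        return []

    # Group write paths by bucket so metadata and data may live in separate buckets.
    paths_by_bucket = {}
    for location in write_locations:
        parsed = urlparse(location)  # assumed form: gs://<bucket>/<path>
        bucket = parsed.netloc
        path = parsed.path.lstrip("/")
        paths_by_bucket.setdefault(bucket, []).append(path)

    rules = []
    for bucket, paths in paths_by_bucket.items():
        # One startsWith condition per write path, OR-ed together per bucket.
        conditions = [
            "resource.name.startsWith("
            f"'projects/_/buckets/{bucket}/managedFolders/{path}')"
            for path in paths
        ]
        rules.append({
            "availableResource": f"//storage.googleapis.com/projects/_/buckets/{bucket}",
            "availablePermissions": ["inRole:roles/storage.folderAdmin"],
            "availabilityCondition": {"expression": " || ".join(conditions)},
        })
    return rules
```

Calling it with a metadata location in one bucket and a data location in another yields two rules, each scoped to its own bucket and path. Calling it with the flag set to false yields an empty list, which matches the opt-in behaviour described in the thread. Read-only locations are simply never passed in.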
