Hi community,

Thanks for the inputs during the catalog sync! I want to summarize the decisions and direction that were agreed on during the sync.

*Direction*

- We'll introduce a storage-refresh-token concept that integrates with the existing StorageCredential mechanism rather than being a staging-specific construct. This keeps the design reusable across different APIs going forward.
- We agreed not to model this after the planId-based credential vending used in scan planning. The community is open to refactoring planId credential refresh to use the storage credential refresh token pattern in the future.

*Discarded approaches*

1. table-uuid as the identifier - overloads a spec-level identifier for a purpose it wasn't designed for
2. Server-side state / sessions - adds operational complexity, and some existing catalog implementations assume stateless staged table creation
3. Overloading OAuth scopes - conflates storage credential refresh with the OAuth layer

I will share an updated design doc and spec PR reflecting this direction.

On Tue, Feb 10, 2026 at 11:14 AM Maninder Parmar <[email protected]> wrote:

> Thanks for reviewing the proposal Huaxin!
>
> *"Since stagingSession is in the URL and may show up in logs, should it be
> treated as a secret token (hard to guess, short expiry)?"*
> No, stagingSession is not a secret; it is just an identifier for the
> session. It is up to the catalog server implementation whether only the
> user who was issued the stagingSession, or any user with the
> stagingSession, may call commit on the table. It can use existing
> authentication mechanisms to enforce those constraints.
>
> *"If it leaks, can someone else use it, or is it restricted to the same
> user/job that created the staged table?"*
> Since it's not a secret but merely an identifier (just like planId), there
> should not be a risk if it leaks. It's up to the catalog server
> implementation whether to restrict it to the same user/job or not.
>
> *"What happens if a CTAS job crashes or is cancelled after staging?
> Does the stagingSession expire automatically, and is there a way to clean
> up/abort the staged create?"*
> The lifecycle of a stagingSession is up to the catalog server. There are
> multiple strategies that could be used here, such as automatically
> expiring the session after a few hours if no updateTable call was made for
> that session, or expiring active sessions when one of them is committed.
> There would not be any additional API surface area exposed to clients to
> manage the session lifecycle; it is the responsibility of the catalog
> server.
>
> Let me know if you have follow-up questions.
>
> On Mon, Feb 9, 2026 at 7:07 PM huaxin gao <[email protected]> wrote:
>
>> Hi Maninder,
>>
>> Thanks for the proposal! It sounds like a good direction to me. Returning
>> a stagingSession from stage-create and then reusing it for
>> loadCredentials/loadTable feels consistent with the existing planId
>> pattern, and it fixes a real CTAS problem.
>>
>> A few questions:
>>
>> Since stagingSession is in the URL and may show up in logs, should it be
>> treated as a secret token (hard to guess, short expiry)?
>>
>> If it leaks, can someone else use it, or is it restricted to the same
>> user/job that created the staged table?
>>
>> What happens if a CTAS job crashes or is cancelled after staging? Does
>> the stagingSession expire automatically, and is there a way to clean
>> up/abort the staged create?
>>
>> Would love to hear your thoughts on these.
>>
>> Thanks,
>>
>> Huaxin
>>
>> On Mon, Feb 9, 2026 at 4:30 PM Maninder Parmar <[email protected]> wrote:
>>
>>> Hello Iceberg community!
>>>
>>> I wanted to discuss the proposal for refreshing storage credentials for
>>> staged table creation. Iceberg tables can be created either via a
>>> single-step creation flow or via a two-step staged creation flow, which
>>> is used for implementing CTAS (CREATE TABLE AS SELECT) statements.
>>> Currently, it's not possible to refresh the credentials for staged
>>> tables, since they are not committed in the catalog and hence not
>>> visible to loadTable or the credentials endpoint.
>>> There has been prior discussion
>>> <https://lists.apache.org/thread/q5n355d89nxbhywtlv3qhq7dchbyb67d> where
>>> community members have expressed the need for supporting this scenario.
>>>
>>> I have started a proposal
>>> <https://docs.google.com/document/d/1R1K6X7qYqvIFkPG3m1neV5Mvy8rwWJvhSFr8DgJgQ-E/edit?tab=t.0>
>>> to flesh out the details to support this scenario, building on the
>>> precedent of credential vending support for scan planning.
>>> The OpenAPI changes can be seen in PR #15280
>>> <https://github.com/apache/iceberg/pull/15280>
>>>
>>> Looking forward to your feedback.
>>>
>>> Thanks,
>>> Maninder
>>>
>>> Proposal: Credential Refresh for Staged Table Creation
>>> <https://drive.google.com/open?id=1R1K6X7qYqvIFkPG3m1neV5Mvy8rwWJvhSFr8DgJgQ-E>
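
For readers who want something concrete, here is a rough sketch of how the direction summarized at the top of this thread could look: a vended storage credential carries an opaque refresh token, and presenting that token is enough to get a fresh credential, with no table lookup and no server-side session. This is only an illustration under assumptions of mine, not the design in the doc or the spec PR. The `refresh_token` field name, the `expires-at-ms` config key, the HMAC-signed token format, and both function names are hypothetical; only the `prefix`/`config` shape follows the existing StorageCredential.

```python
import base64
import hashlib
import hmac
import json
from dataclasses import dataclass

# Hypothetical signing key; a real catalog would manage and rotate keys itself.
SECRET = b"demo-secret"


@dataclass
class StorageCredential:
    # prefix/config follow the existing StorageCredential shape.
    # refresh_token is a hypothetical field name used only for this sketch.
    prefix: str
    config: dict
    refresh_token: str


def _sign(payload: bytes) -> str:
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()


def issue_credential(prefix: str, now: float, ttl_s: int = 900) -> StorageCredential:
    """Vend a short-lived credential plus an opaque token that can renew it."""
    payload = json.dumps({"prefix": prefix}, sort_keys=True).encode()
    token = base64.urlsafe_b64encode(payload).decode() + "." + _sign(payload)
    # "expires-at-ms" is a made-up key standing in for real storage properties.
    config = {"expires-at-ms": str(int((now + ttl_s) * 1000))}
    return StorageCredential(prefix, config, token)


def refresh_credential(token: str, now: float) -> StorageCredential:
    """Check the token's signature and vend a fresh credential for its prefix.

    Nothing is looked up by table name or session id, so this works for a
    staged table that is not yet visible to loadTable.
    """
    encoded, _, signature = token.partition(".")
    payload = base64.urlsafe_b64decode(encoded.encode())
    if not hmac.compare_digest(signature, _sign(payload)):
        raise PermissionError("invalid storage refresh token")
    prefix = json.loads(payload)["prefix"]
    return issue_credential(prefix, now)
```

In this sketch the token only encodes which storage prefix it covers; who is allowed to present it would still be decided by the catalog's existing authentication, and token expiry is left out for brevity.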
