TL;DR: I agree with needing to fix the interface decomposition and that it
shouldn't be too hard. I could go either way on whether it's done in the
existing PR though -- we probably don't want to introduce yet another
pattern if we might forget to refactor the existing MetricsPersistence and
PolicyMappingPersistence to match the same pattern. Re: coupling, I'm not
sure they're entirely "unrelated" but the coupling should be expressed
differently than making BasePersistence a big interface-aggregator anyways,
so I still agree with the refactor.

Regarding the IdempotencyPersistence/BasePersistence hierarchy, though I
don't feel too strongly either way in the short term for expediency (since
internal layout consistency is generally good, and proper interface
decomposition could be done across all the "modular" *Persistence
interfaces at once in a consistent way), I do agree with Dmitri's point
that right now making the "BasePersistence" interface be an aggregator
interface that is directly used at callsites feels like we're sort of stuck
in a one-foot-in-the-door state in terms of interface decomposition.

It seems the splitting of factory and accessor methods could potentially be
lightweight initially, since as I understand it, our first step would just
be to push down the multiple-interface-inheritance into the concrete impls
like "JdbcBasePersistenceImpl implements BasePersistence,
MetricsPersistence, PolicyMappingPersistence, IdempotencyPersistence,
IntegrationPersistence".

At a glance, it looks like we'd need to split out some
"getOrCreateMetricsPersistence", "getOrCreateIdempotencyPersistence", etc.,
in MetaStoreManagerFactory and probably also mirror that set of accessors
into PolarisCallContext alongside the current "BasePersistence
getMetaStore()".

It would probably also be a good time to rename the "getOrCreateSession"
method to better capture what it's doing. And also rename
"PolarisCallContext.getMetaStore()".

I remember this decomposition was the plan back when I split
BasePersistence vs IntegrationPersistence already -- ideally the
introduction of PolicyMappingPersistence and MetricsPersistence should've
followed the pattern of BasePersistence vs IntegrationPersistence already
(though we didn't quite finish pulling IntegrationPersistence into the
factory methods, instead just using a blind cast of
((IntegrationPersistence) ms) in the relevant callsites as a "temporary"
placeholder that accidentally became permanent). Basically impls could
always multi-inherit BasePersistence and FooPersistence, etc., but
BasePersistence itself did *not* directly extend IntegrationPersistence.

There are some caveats though:

So, if we can agree
> that IdempotencyPersistence is not conceptually related to Metastore
> Persistence (my points above) then introducing tight coupling between
> BasePersistence and IdempotencyPersistence is pure tech debt.


I'm not sure I agree they're not conceptually related, though maybe I need
to deep dive review the overall design. I agree that "coincidental" tight
coupling doesn't help anything (so yes we should decompose the interfaces),
but I think we need to define exactly what type of coupling we intend to
support the correct semantics.

The main thing: These idempotency keys are fundamentally to support
ACID-related semantics, unlike best-effort caching types of use cases.
Thus, the existence or nonexistence of idempotency records must be coupled
in *some way* to their associated request "side effects", which in this
case means core metastore persistence state.

If we extend the original transactionality semantics, this is actually
somewhat well defined:

1. TransactionalPersistence will continue to be an aggregator of
BasePersistence and IdempotencyPersistence, with the implication that its
own "runInTransaction" block covers both the BasePersistence and
IdempotencyPersistence calls in a single atomic unit.
2. AtomicOperationMetaStoreManager intentionally does *not* hold a
TransactionalPersistence, so we know that either we're sacrificing certain
race-condition cases without transactions, *or* we define some higher-level
"loose coupling" that enables a multi-phase commit semantic to ensure
cohesive state between PolarisEntity and IdempotencyRecords.

If we wanted to be more "formal" with that "loose coupling" it might mean
adding a different interface like
"NonTransactionalIdempotencyToEntityCoordinator" with methods like
"PolarisEntity addIdempotencyBookkeepingToEntityBeforeChange(IdempotencyKey
key, PolarisEntity entityBeingChanged)" and maybe different
postEntityCommit or abortEntityCommitForIdempotencyKey methods related to
the multi-phase commit handling.


On Wed, May 6, 2026 at 2:27 PM Dmitri Bourlatchkov <[email protected]> wrote:

> Hi All,
>
> As far as I can tell, what I proposed in my previous email on this thread
> aligns with the SPI evolution direction in Metrics Persistence [1].
>
> [1] https://lists.apache.org/thread/7kp36t8fykg2hvzkr96r5m7cdnnyopj9
>
> Cheers,
> Dmitri.
>
> On Wed, May 6, 2026 at 4:55 PM Dmitri Bourlatchkov <[email protected]>
> wrote:
>
> > Hi Huaxin,
> >
> > I agree with your analysis. However, regarding the design direction part,
> > from my POV, it would be preferable to introduce a parallel factory
> > dedicated to IdempotencyPersistence.
> >
> > Idempotency Persistence is not conceptually connected to Metastore
> > Persistence.
> >
> > Therefore, a BasePersistence (Metastore) implementation using a
> particular
> > database technology does not have to implement IdempotencyPersistence
> using
> > the same technology.
> >
> > Moreover, it is conceivable that a Polaris admin may want to use
> different
> > database technologies for IdempotencyPersistence and Metastore
> Persistence.
> > I'd even suggest that this approach may be advisable because the
> > performance and data consistency characteristics expected
> > of IdempotencyPersistence and the Metastore are very different due to
> their
> > distinct use cases.
> >
> > When SPIs are separated, a particular implementation of Metastore
> > Persistence can also implement IdempotencyPersistence and vice versa.
> >
> > Leaving `default` methods unimplemented (current state of [4269]) may
> > permit some Metastore implemetation to ignore the IdempotencyPersistence
> > SPI, but in that situation the Polaris admin user would have no option to
> > support IdempotencyPersistence via a different technology.
> >
> > All in all, from my POV I see no disadvantage to isolating these SPIs,
> > whereas I do see disadvantages in their coupling.
> >
> > This is also the direction I'd advocate in a more general Polaris SPI
> > discussion (which has not fully started on `dev` yet).
> >
> > Re: PR [4269] specifically, I believe isolating IdempotencyPersistence
> > from BasePersistence is not technically difficult. So, if we can agree
> > that IdempotencyPersistence is not conceptually related to Metastore
> > Persistence (my points above) then introducing tight coupling between
> > BasePersistence and IdempotencyPersistence is pure tech debt. Instead of
> > adding this coupling only to remove it in a follow-up PR, I believe it
> > would require less work to restructure the code and avoid introducing the
> > coupling upfront.
> >
> > [4269] https://github.com/apache/polaris/pull/4269
> >
> > Cheers,
> > Dmitri.
> >
> > On Tue, May 5, 2026 at 7:45 PM huaxin gao <[email protected]>
> wrote:
> >
> >> Hi Dimitri,
> >>
> >> Thanks for raising this. I'd prefer email so it's async and easier for
> >> everyone to participate at their own pace. A dedicated sync can come
> later
> >> if email gets stuck.
> >>
> >> My understanding is that BasePersistence already extends
> >> PolicyMappingPersistence and MetricsPersistence, plus has event
> >> persistence
> >> folded in directly. That's because AtomicOperationMetaStoreManager
> >> operates
> >> against a single BasePersistence instance per realm, and each backend
> >> (in-memory, JDBC, NoSQL) has one factory chain producing it. So to
> follow
> >> the current architecture, we need to make BasePersistence extend
> >> IdempotencyPersistence too. Splitting IdempotencyPersistence out as a
> >> top-level SPI would mean adding parallel factory plumbing in every
> >> backend.
> >>
> >> Happy to discuss the broader SPI shape on this thread. I'd prefer to
> land
> >> #4269 with the current shape since it follows the agreed-upon
> >> architecture,
> >> and revisit the layering uniformly in a follow-up once we converge —
> that
> >> broader discussion may take a while.
> >>
> >> Thanks,
> >>
> >> Huaxin
> >>
> >> On Tue, May 5, 2026 at 2:29 PM Dmitri Bourlatchkov <[email protected]>
> >> wrote:
> >>
> >> > Hi Huaxin, Yufei,
> >> >
> >> > I left some comments on [4269] mostly about SPI design concerns.
> >> >
> >> > Since we wanted to reopen a more general SPI discussion anyway, it
> >> might be
> >> > worth having a dedicated sync about that...
> >> >
> >> > Alternatively, we could have that discussion by email now, if you
> >> prefer. I
> >> > have some starting points to share for discussion.
> >> >
> >> > WDYT?
> >> >
> >> > [4269] https://github.com/apache/polaris/pull/4269
> >> >
> >> > Thanks,
> >> > Dmitri.
> >> >
> >> > On Fri, Apr 17, 2026 at 7:07 PM huaxin gao <[email protected]>
> >> wrote:
> >> >
> >> > > Hi Robert,
> >> > >
> >> > > Thanks, and I’m glad we’re aligned on the handler-level direction. I
> >> also
> >> > > appreciate the detailed review and feedback throughout this
> >> discussion.
> >> > >
> >> > > Since the old implementation is going away and the handler-level
> >> > > implementation will be in a fresh PR, would you be OK removing the
> -1
> >> > from
> >> > > PR #3803 <https://github.com/apache/polaris/pull/3803>? If so, I’ll
> >> > close
> >> > > the old PR and open a new one for the handler-level approach.
> >> > >
> >> > > Best,
> >> > >
> >> > > Huaxin
> >> > >
> >> > > On Thu, Apr 16, 2026 at 12:57 AM Robert Stupp <[email protected]>
> wrote:
> >> > >
> >> > > > i Yufei, Huaxin,
> >> > > >
> >> > > > Thanks for the thoughtful replies, and for opening the issue for
> >> > > > `createTableStaged` [1]. I’m glad we’re converging on the
> >> handler-level
> >> > > > direction.
> >> > > >
> >> > > > I agree benchmarking can run in parallel with implementation. For
> my
> >> > own
> >> > > > review, I’d prefer to wait until reproducible benchmark results
> are
> >> > > posted
> >> > > > (using the scope from my April 14 note, or with any scope changes
> >> > called
> >> > > > out).
> >> > > >
> >> > > > Happy to review once that is available.
> >> > > >
> >> > > > Best,
> >> > > > Robert
> >> > > >
> >> > > > [1] https://github.com/apache/polaris/issues/4200
> >> > > >
> >> > > >
> >> > > > On Tue, Apr 14, 2026 at 7:54 PM huaxin gao <
> [email protected]>
> >> > > wrote:
> >> > > >
> >> > > > > Hi Robert,
> >> > > > >
> >> > > > > Thanks for the feedback. Glad the handler-level design addresses
> >> the
> >> > > > > concerns.
> >> > > > >
> >> > > > > I also feel we should converge on the design and implementation
> >> > first,
> >> > > > and
> >> > > > > run benchmarks in parallel. The access patterns will be
> different
> >> > with
> >> > > > the
> >> > > > > handler-level approach, so benchmarking the current filter-based
> >> code
> >> > > > > wouldn't be representative.
> >> > > > >
> >> > > > > I'll open a GitHub issue to track createTableStaged support, and
> >> I'll
> >> > > > open
> >> > > > > a fresh PR for the handler-level implementation.
> >> > > > >
> >> > > > > Best,
> >> > > > >
> >> > > > > Huaxin
> >> > > > >
> >> > > > > On Tue, Apr 14, 2026 at 10:30 AM Yufei Gu <[email protected]
> >
> >> > > wrote:
> >> > > > >
> >> > > > > > Hi Robert,
> >> > > > > >
> >> > > > > > Thanks for the thorough review and for clearly laying out the
> >> > > concerns,
> >> > > > > > this is very helpful.
> >> > > > > >
> >> > > > > > For point 1(Q3), I do not think this needs to be a hard
> >> > prerequisite
> >> > > > > before
> >> > > > > > looking at the implementation.
> >> > > > > >
> >> > > > > > The handler level design and correctness concerns are largely
> >> > > > independent
> >> > > > > > from the exact performance characteristics. It would be more
> >> > > productive
> >> > > > > to
> >> > > > > > first converge on the right architecture, then validate and
> >> iterate
> >> > > on
> >> > > > > > performance with concrete measurements. Also, while the store
> is
> >> > > > reused,
> >> > > > > > the access pattern and critical sections change with the
> handler
> >> > > level
> >> > > > > > approach. Measuring the current implementation in isolation
> may
> >> not
> >> > > > > reflect
> >> > > > > > the actual behavior of the final design.
> >> > > > > >
> >> > > > > > We already have a similar situation with the NoSQL benchmark,
> >> where
> >> > > the
> >> > > > > > code has been merged for a while but the benchmarking work is
> >> still
> >> > > in
> >> > > > > > progress. That has not blocked design or code review, and we
> are
> >> > > > > iterating
> >> > > > > > on performance as we go.
> >> > > > > >
> >> > > > > > I agree that realistic benchmarking is important. But I would
> >> > suggest
> >> > > > we
> >> > > > > > treat this as a parallel track, not a gate. Otherwise we risk
> >> > > blocking
> >> > > > > > progress on the design without knowing whether the measured
> >> > overhead
> >> > > is
> >> > > > > > even representative of the final system.
> >> > > > > >
> >> > > > > > Happy to help define a benchmark plan once the design is
> >> settled.
> >> > > > > >
> >> > > > > > Yufei
> >> > > > > >
> >> > > > > >
> >> > > > > > On Tue, Apr 14, 2026 at 6:28 AM Robert Stupp <[email protected]>
> >> > wrote:
> >> > > > > >
> >> > > > > > > Hi Huaxin,
> >> > > > > > >
> >> > > > > > > yeah, the handler-level approach is the right call, I'm glad
> >> you
> >> > > went
> >> > > > > > with
> >> > > > > > > it.
> >> > > > > > >
> >> > > > > > > To be explicit about what your proposal aims to resolve:
> >> > > > > > >
> >> > > > > > > - Authorization bypass is fixed. The check now runs after
> >> > > > > resource-level
> >> > > > > > > authorization.
> >> > > > > > > - No credentials in the database. Fresh credentials
> re-vended
> >> on
> >> > > > replay
> >> > > > > > via
> >> > > > > > > the normal vending path.
> >> > > > > > > - Phase 2 architecture problem gone. The handler has the
> >> > principal,
> >> > > > > > storage
> >> > > > > > > config, and IAM access that the filter never could have.
> >> > > > > > > - Caller identity is in the binding.
> >> > > > > > >
> >> > > > > > > Three things still need adressing:
> >> > > > > > >
> >> > > > > > > First, and this is a hard prerequisite before I look at any
> >> > > > > > implementation
> >> > > > > > > code: Q3 from April 5 is still open. The store is being
> reused
> >> > > as-is,
> >> > > > > so
> >> > > > > > > the overhead can be measured right now without new handler
> >> code.
> >> > 3+
> >> > > > DB
> >> > > > > > > round-trips per mutation on the catalog connection pool,
> >> > write-hot
> >> > > > > table,
> >> > > > > > > polling loops under duplicate traffic - those numbers need
> to
> >> > come
> >> > > > > from a
> >> > > > > > > realistic setup: networked database under concurrent load,
> >> not a
> >> > > > local
> >> > > > > > > Postgres. The latency concerns are specifically about Aurora
> >> and
> >> > > > > > > CockroachDB, and the polling storm only shows up under
> >> concurrent
> >> > > > > > duplicate
> >> > > > > > > traffic. Please share the methodology so the results are
> >> > > > reproducible.
> >> > > > > > >
> >> > > > > > > Second: please open a GitHub issue for the createTableStaged
> >> > > > deferral.
> >> > > > > > >
> >> > > > > > > Third: since the filter is going away entirely and the
> >> critical
> >> > > > > sections
> >> > > > > > > move to the handlers, does it make sense to close #3803 and
> >> open
> >> > a
> >> > > > > fresh
> >> > > > > > > PR? The old diff is mostly code that's being deleted, and it
> >> > > carries
> >> > > > a
> >> > > > > > lot
> >> > > > > > > of review history. A clean PR would be easier for everyone
> to
> >> > > review
> >> > > > -
> >> > > > > > just
> >> > > > > > > reference the old one for context.
> >> > > > > > >
> >> > > > > > > Happy to do a fresh review once the new implementation is
> up.
> >> > > > > > >
> >> > > > > > > Best,
> >> > > > > > > Robert
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > On Tue, Apr 14, 2026 at 1:44 AM huaxin gao <
> >> > [email protected]
> >> > > >
> >> > > > > > wrote:
> >> > > > > > >
> >> > > > > > > > Hi Robert,
> >> > > > > > > >
> >> > > > > > > > Here is a revised design proposal for the idempotency
> replay
> >> > > > > mechanism.
> >> > > > > > > I'd
> >> > > > > > > > appreciate your feedback. Thanks!
> >> > > > > > > >
> >> > > > > > > > *Moving idempotency to handler level*
> >> > > > > > > >
> >> > > > > > > > The idempotency check moves from the HTTP filter into the
> >> > handler
> >> > > > > > > methods,
> >> > > > > > > > after authorization. The filter is removed. The caller's
> >> > > principal
> >> > > > > hash
> >> > > > > > > is
> >> > > > > > > > included in the idempotency binding and checked on replay.
> >> > > > > > > >
> >> > > > > > > > *Credential-bearing mutations (createTable)*
> >> > > > > > > >
> >> > > > > > > > On replay, instead of returning a stored response, the
> >> handler
> >> > > > calls
> >> > > > > > > > buildLoadTableResponseWithDelegationCredentials with the
> >> > existing
> >> > > > > > table's
> >> > > > > > > > metadata. This vends fresh credentials for the current
> >> caller
> >> > > > through
> >> > > > > > the
> >> > > > > > > > normal credential vending path. No credentials are stored
> in
> >> > the
> >> > > > > > > database —
> >> > > > > > > > the idempotency record only stores the key, principal
> hash,
> >> > table
> >> > > > > > > > identifier, and status code.
> >> > > > > > > >
> >> > > > > > > > *Non-credential mutations (dropTable, renameTable,
> >> > namespace/view
> >> > > > > > > > operations)*
> >> > > > > > > >
> >> > > > > > > > These do not vend credentials, so there is no leakage
> >> concern.
> >> > On
> >> > > > > > replay,
> >> > > > > > > > the handler returns the stored status code and minimal
> >> response
> >> > > > body.
> >> > > > > > > >
> >> > > > > > > > *createTableStaged*
> >> > > > > > > >
> >> > > > > > > > Staged tables are not persisted to the catalog, so the
> >> method
> >> > can
> >> > > > > > safely
> >> > > > > > > > re-run on retry. Idempotency support for this endpoint can
> >> be
> >> > > > > deferred.
> >> > > > > > > >
> >> > > > > > > > *What is reused from the current implementation*
> >> > > > > > > >
> >> > > > > > > > The IdempotencyStore interface, database schema,
> >> configuration,
> >> > > and
> >> > > > > > > > background purge logic are reused. The main change is
> moving
> >> > the
> >> > > > > > > > check/finalize logic from the filter into the handler.
> >> > > > > > > >
> >> > > > > > > > Best,
> >> > > > > > > >
> >> > > > > > > > Huaxin
> >> > > > > > > >
> >> > > > > > > > On Sun, Apr 5, 2026 at 11:53 AM huaxin gao <
> >> > > [email protected]
> >> > > > >
> >> > > > > > > wrote:
> >> > > > > > > >
> >> > > > > > > > > Hi Robert,
> >> > > > > > > > >
> >> > > > > > > > > Thanks for the follow-up.
> >> > > > > > > > >
> >> > > > > > > > > I appreciate the clarification. I read the listed items
> as
> >> > fix
> >> > > > > > options
> >> > > > > > > > and
> >> > > > > > > > > picked option 2, and the Phase 1/Phase 2 split was
> purely
> >> to
> >> > > keep
> >> > > > > the
> >> > > > > > > PR
> >> > > > > > > > at
> >> > > > > > > > > a reviewable size, not a design choice.
> >> > > > > > > > >
> >> > > > > > > > > I agree that the caller's identity should be part of the
> >> > > binding.
> >> > > > > > That
> >> > > > > > > is
> >> > > > > > > > > straightforward to add.
> >> > > > > > > > >
> >> > > > > > > > > On the architecture question, whether re-vending can
> work
> >> > > within
> >> > > > > the
> >> > > > > > > > > filter or needs a handler-level approach, I want to give
> >> this
> >> > > > > proper
> >> > > > > > > > > thought rather than rush an answer. I may not be able to
> >> get
> >> > > back
> >> > > > > to
> >> > > > > > > you
> >> > > > > > > > on
> >> > > > > > > > > this until next week (the week of 4/13).
> >> > > > > > > > >
> >> > > > > > > > > Thanks,
> >> > > > > > > > >
> >> > > > > > > > > Huaxin
> >> > > > > > > > >
> >> > > > > > > > > On Sat, Apr 4, 2026 at 11:42 PM Robert Stupp <
> >> [email protected]
> >> > >
> >> > > > > wrote:
> >> > > > > > > > >
> >> > > > > > > > >> Hi Huaxin,
> >> > > > > > > > >>
> >> > > > > > > > >> Thanks for responding. A few corrections and
> >> clarifications.
> >> > > > > > > > >>
> >> > > > > > > > >> My PR review described three consequences of the
> missing
> >> > > caller
> >> > > > > > > > >> identity binding and the credential leakage attack
> >> vector.
> >> > > Those
> >> > > > > > > > >> were not suggestions for a Phase 1/Phase 2 split. The
> >> > offline
> >> > > > > > > > >> decision to split into phases, and what each phase
> >> contains,
> >> > > was
> >> > > > > > > > >> yours [1].
> >> > > > > > > > >>
> >> > > > > > > > >> On the phasing: the concern is not PR size. The concern
> >> is
> >> > > that
> >> > > > > > > > >> Phase 1 alone ships a full authorization bypass. The
> >> filter
> >> > > > > replays
> >> > > > > > > > >> cached responses before Polaris's resource-level
> >> > authorization
> >> > > > > > > > >> runs. Stripping credentials does not fix that. Any
> >> > > authenticated
> >> > > > > > > > >> user with a valid idempotency key can receive a cached
> >> > > response
> >> > > > > > > > >> for a resource they are not authorized to access. That
> is
> >> > the
> >> > > > case
> >> > > > > > > > >> with or without credentials in the response.
> >> > > > > > > > >>
> >> > > > > > > > >> "We will not release it until entire idempotency coding
> >> is
> >> > > > > > > > >> finished" is reassuring, but the architecture question
> >> > > remains:
> >> > > > > > > > >> can the generic filter approach support re-vending at
> >> all?
> >> > My
> >> > > > > > > > >> email explained why I believe it cannot, since it would
> >> > > require
> >> > > > > > > > >> the filter to authorize the caller, resolve storage
> >> > locations,
> >> > > > > > > > >> call cloud IAM, and manage credential caching. That is
> >> the
> >> > > > catalog
> >> > > > > > > > >> handler's job, not a filter's.
> >> > > > > > > > >>
> >> > > > > > > > >> The questions from my original email that still need
> >> > answers:
> >> > > > > > > > >>
> >> > > > > > > > >> How does the binding prevent cross-principal cache hits
> >> > > without
> >> > > > > > > > >> including the caller's identity?
> >> > > > > > > > >>
> >> > > > > > > > >> How does Phase 2 re-vend credentials from a generic
> HTTP
> >> > > filter
> >> > > > > > > > >> without reimplementing the catalog handler's
> >> authorization
> >> > and
> >> > > > > > > > >> credential vending logic?
> >> > > > > > > > >>
> >> > > > > > > > >> Has the operational overhead (3 DB round-trips per
> >> mutation
> >> > on
> >> > > > > > > > >> the catalog's connection pool) been benchmarked on any
> >> > > backend?
> >> > > > > > > > >>
> >> > > > > > > > >> I'd like to resolve these on this thread before more
> >> code is
> >> > > > > > > > >> written.
> >> > > > > > > > >>
> >> > > > > > > > >> Best,
> >> > > > > > > > >> Robert
> >> > > > > > > > >>
> >> > > > > > > > >> [1]
> >> > > > > > >
> >> > > https://github.com/apache/polaris/pull/3803#issuecomment-4180969838
> >> > > > > > > > >>
> >> > > > > > > > >> On Sun, Apr 5, 2026 at 7:55 AM huaxin gao <
> >> > > > [email protected]
> >> > > > > >
> >> > > > > > > > wrote:
> >> > > > > > > > >>
> >> > > > > > > > >> > Hi Robert,
> >> > > > > > > > >> >
> >> > > > > > > > >> > Thanks for raising this on the mailing list. I
> >> acknowledge
> >> > > the
> >> > > > > > > design
> >> > > > > > > > >> > concerns you listed.
> >> > > > > > > > >> >
> >> > > > > > > > >> > To clarify: I picked your option 2 (strip credentials
> >> and
> >> > > > > re-vend
> >> > > > > > on
> >> > > > > > > > >> > replay) you suggested in the PR review. The offline
> >> > > discussion
> >> > > > > > with
> >> > > > > > > > >> Yufei
> >> > > > > > > > >> > and Prashant was simply to verify that I picked the
> >> best
> >> > > > option
> >> > > > > > from
> >> > > > > > > > all
> >> > > > > > > > >> > the three suggestions you listed in the review.
> >> > > > > > > > >> >
> >> > > > > > > > >> > The reason I split it into two phases is that the
> >> current
> >> > PR
> >> > > > is
> >> > > > > > > > already
> >> > > > > > > > >> > around 3000 lines. Adding re-vending logic on top
> would
> >> > make
> >> > > > it
> >> > > > > > very
> >> > > > > > > > >> hard
> >> > > > > > > > >> > to review. This is a big feature and we need to do it
> >> > piece
> >> > > by
> >> > > > > > > piece.
> >> > > > > > > > We
> >> > > > > > > > >> > will not release it until entire idempotency coding
> is
> >> > > > finished.
> >> > > > > > > > >> >
> >> > > > > > > > >> > I'm happy to discuss the architecture further on this
> >> > > thread,
> >> > > > > > > whether
> >> > > > > > > > >> the
> >> > > > > > > > >> > filter approach can support re-vending, or whether we
> >> > need a
> >> > > > > > > > >> handler-level
> >> > > > > > > > >> > design. I may be slow to respond next week but I'll
> >> follow
> >> > > up
> >> > > > > > after
> >> > > > > > > > >> that.
> >> > > > > > > > >> >
> >> > > > > > > > >> > Thanks,
> >> > > > > > > > >> >
> >> > > > > > > > >> > Huaxin
> >> > > > > > > > >> >
> >> > > > > > > > >> > On Sat, Apr 4, 2026 at 9:34 AM Robert Stupp <
> >> > [email protected]
> >> > > >
> >> > > > > > wrote:
> >> > > > > > > > >> >
> >> > > > > > > > >> > > Hi all,
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > I need to raise some fundamental design concerns
> with
> >> > the
> >> > > > > > current
> >> > > > > > > > >> > > Idempotency-Key implementation in PR #3803 [1].
> Some
> >> of
> >> > > > these
> >> > > > > > were
> >> > > > > > > > >> > > raised in November 2025 [2] and were acknowledged
> [3]
> >> > but
> >> > > > > never
> >> > > > > > > > >> > > addressed in the implementation.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > I raised the caller identity concern on the
> >> > > > > > > > >> > > mailing list in November 2025 [2]. In December 2025
> >> > [4], I
> >> > > > > > > > >> > > reiterated that all existing implementations I
> could
> >> > find
> >> > > do
> >> > > > > > > > >> > > request fingerprinting including the request body,
> >> and
> >> > > > warned
> >> > > > > > that
> >> > > > > > > > >> > > the idempotency subsystem itself must not become a
> >> > source
> >> > > of
> >> > > > > > > > >> > > failures. The PR was opened in February 2026
> without
> >> > > > > addressing
> >> > > > > > > > >> > > either point.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > The credential leakage consequences surfaced during
> >> code
> >> > > > > review
> >> > > > > > > > >> > > [5]. The response was an offline decision between
> >> three
> >> > > > > > > > contributors,
> >> > > > > > > > >> > > without mailing list discussion [6]. The decision
> >> was to
> >> > > > split
> >> > > > > > > > >> > > into Phase 1 (strip credentials, which produces
> >> broken
> >> > > > > responses
> >> > > > > > > > >> > > but does not fix the authorization bypass) and
> Phase
> >> 2
> >> > > > > (re-vend
> >> > > > > > > > >> > > on replay, which would require a complete redesign
> >> and
> >> > > > > > > > >> > > reimplementation). Details below.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > I think we need to resolve these at the design
> level
> >> > > before
> >> > > > > more
> >> > > > > > > > >> > > code is written.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > I will not do further code reviews on this PR until
> >> > these
> >> > > > > design
> >> > > > > > > > >> > > concerns are resolved.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > 1. Caller identity is not part of the idempotency
> >> > binding
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > The caller's identity is not part of the binding.
> The
> >> > > > > > > > >> > > filter replays cached responses before Polaris's
> >> > > > > resource-level
> >> > > > > > > > >> > > authorization runs. Any authenticated user with a
> >> valid
> >> > > > > > > idempotency
> >> > > > > > > > >> > > key can receive a cached response for a resource
> they
> >> > are
> >> > > > not
> >> > > > > > > > >> > > authorized to access, including vended storage
> >> > > credentials.
> >> > > > > This
> >> > > > > > > > >> > > is a full authorization bypass.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > Idempotency keys are HTTP headers that end up in
> >> access
> >> > > > logs,
> >> > > > > > > > >> > > observability tooling, debug output. They cannot
> be a
> >> > > > security
> >> > > > > > > > >> > > boundary. I described the attack vector in my PR
> >> review
> >> > > [5].
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > The proposed Phase 2 [6] says it will "re-authorize
> >> the
> >> > > user
> >> > > > > and
> >> > > > > > > > >> > > re-vend fresh credentials on replay." That
> addresses
> >> > > > > credentials
> >> > > > > > > > >> > > but not the binding itself.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > 2. Credential stripping breaks Polaris's deployment
> >> > model
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > Without mailing list discussion, the decision was
> >> made
> >> > > > offline
> >> > > > > > > > >> > > to strip credentials before storing ("Phase 1")
> [6].
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > Polaris's purpose is to vend down-scoped
> credentials.
> >> > > > Clients
> >> > > > > > have
> >> > > > > > > > >> > > no ambient storage access.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > A stripped response returns a successful response
> >> > ("table
> >> > > > > > > created")
> >> > > > > > > > >> > > to the client, which then fails hard because it
> >> cannot
> >> > > > access
> >> > > > > > the
> >> > > > > > > > >> > > object storage. Query engines (Spark, Trino, etc.)
> >> will
> >> > > fail
> >> > > > > on
> >> > > > > > > > >> > > the first storage operation against that table.
> Query
> >> > > engine
> >> > > > > > users
> >> > > > > > > > >> > > have no way to resolve this, they don't control the
> >> REST
> >> > > > > catalog
> >> > > > > > > > >> > > interaction. For stage-create, the client gets a
> >> > > successful
> >> > > > > > > > >> > > response telling it to write data but has no
> >> credentials
> >> > > to
> >> > > > > > write
> >> > > > > > > > >> > > with. No fallback exists.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > The stripped response is spec-compliant, but
> >> > > spec-compliant
> >> > > > > and
> >> > > > > > > > >> > > operationally correct are not the same thing.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > 3. Phase 2 is architecturally incompatible with
> >> Phase 1
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > The filter is a generic HTTP filter below the
> >> > application
> >> > > > > layer.
> >> > > > > > > It
> >> > > > > > > > >> > > treats responses as opaque JSON. Credential vending
> >> > > happens
> >> > > > > deep
> >> > > > > > > > >> > > inside IcebergCatalogHandler, scoped to specific
> >> table
> >> > > > > locations
> >> > > > > > > > >> > > and principals, through cloud IAM calls (STS
> >> AssumeRole,
> >> > > GCS
> >> > > > > > token
> >> > > > > > > > >> > > exchange).
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > Re-vending on replay would require the filter to
> >> resolve
> >> > > the
> >> > > > > > > > >> > > table's storage location, authorize the caller,
> call
> >> > cloud
> >> > > > IAM
> >> > > > > > to
> >> > > > > > > > >> > > mint scoped credentials, and manage credential
> >> caching.
> >> > > That
> >> > > > > is
> >> > > > > > > > >> > > not a filter, that is reimplementing the catalog
> >> > handler's
> >> > > > > > > > >> > > authorization and credential vending logic.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > Phase 1 and Phase 2 need to be designed and shipped
> >> > > > together.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > 4. Operational overhead
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > Every idempotent mutation adds 3 database
> round-trips
> >> > > > (INSERT,
> >> > > > > > > > UPDATE,
> >> > > > > > > > >> > > heartbeat UPDATEs), each borrowing its own
> connection
> >> > > > > > > > >> > > (autoCommit=true, no reuse), sharing the catalog's
> >> > > > connection
> >> > > > > > pool
> >> > > > > > > > >> > > and RDBMS. This added latency increases the
> >> probability
> >> > of
> >> > > > > > client
> >> > > > > > > > >> > > timeouts, which are exactly the transient failures
> >> that
> >> > > > > > > idempotency
> >> > > > > > > > >> > > is supposed to prevent.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > The idempotency table is high-churn and write-hot,
> >> > > competing
> >> > > > > > with
> >> > > > > > > > >> > > catalog operations for connection pool capacity.
> The
> >> > > > overhead
> >> > > > > > gets
> >> > > > > > > > >> > > worse on managed and distributed database backends.
> >> > > > Duplicate
> >> > > > > > > > >> > > requests trigger polling loops with repeated
> database
> >> > > reads.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > No benchmark data has been presented for any
> backend.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > What needs to change
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > The current generic-filter approach cannot support
> >> these
> >> > > > > > > > >> > > requirements. This needs a different design.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > A new design must address the following:
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > a) The idempotency binding MUST include the
> caller's
> >> > > > principal
> >> > > > > > > > >> > >    identity.
> >> > > > > > > > >> > > b) Credential-bearing responses must not be stored
> >> > as-is.
> >> > > On
> >> > > > > > > > >> > >    replay, the handler must build a fresh response
> >> with
> >> > > > fresh
> >> > > > > > > > >> > >    credentials for the current caller.
> >> > > > > > > > >> > > c) The operational overhead needs to be
> benchmarked.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > I am happy to review code once the design addresses
> >> > these
> >> > > > > > points.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > Best,
> >> > > > > > > > >> > > Robert
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > [1] https://github.com/apache/polaris/pull/3803
> >> > > > > > > > >> > > [2]
> >> > > > > > > >
> >> > https://lists.apache.org/thread/qqrgr2bzsp69629jdj8kf39m10pzwy6l
> >> > > > > > > > >> > >     (Nov 24, 2025: "Building a service-internal key
> >> from
> >> > > the
> >> > > > > > > client
> >> > > > > > > > >> > >     provided idempotency key plus more data from
> the
> >> > > request
> >> > > > > > gives
> >> > > > > > > > >> > >     better uniqueness. That more data can come
> from:
> >> > > > operation
> >> > > > > > id,
> >> > > > > > > > >> > >     all operation parameters, including the whole
> >> > payload,
> >> > > > > > quite a
> >> > > > > > > > >> > >     few HTTP request headers like user-agent,
> >> > > authorization,
> >> > > > > > host,
> >> > > > > > > > >> > >     accept-* and also the client's address.")
> >> > > > > > > > >> > > [3]
> >> > > > > > > >
> >> > https://lists.apache.org/thread/7vy44mxsq2cgtp8gsj0r8pk8blp37h5b
> >> > > > > > > > >> > >     (Dec 8, 2025: "I agree that a bare
> >> Idempotency-Key +
> >> > > > > entity
> >> > > > > > > > >> > >     identifier is not enough to protect against
> >> buggy or
> >> > > > > > malicious
> >> > > > > > > > >> > >     clients; fingerprinting the full request would
> be
> >> > > > > > stronger.")
> >> > > > > > > > >> > > [4]
> >> > > > > > > >
> >> > https://lists.apache.org/thread/jt3gmpnjh6z490dvxorjs6ms00kvo264
> >> > > > > > > > >> > >     (Dec 8, 2025: "We should avoid failing Polaris
> if
> >> > this
> >> > > > > > > subsystem
> >> > > > > > > > >> > >     fails, or letting this subsystem be a reason
> for
> >> its
> >> > > > > > existence
> >> > > > > > > > >> > >     (aka retry due to timeouts because the
> >> > > > idempotent-request
> >> > > > > > > > >> > >     subsystem hangs).")
> >> > > > > > > > >> > > [5]
> >> > > > > > > > >> > >
> >> > > > > > > > >>
> >> > > > > > > >
> >> > > > > >
> >> > > >
> >> >
> >>
> https://github.com/apache/polaris/pull/3803#pullrequestreview-4051690055
> >> > > > > > > > >> > > [6]
> >> > > > > > > > >>
> >> > > > >
> >> https://github.com/apache/polaris/pull/3803#issuecomment-4180969838
> >> > > > > > > > >> > >
> >> > > > > > > > >> >
> >> > > > > > > > >>
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
>

Reply via email to