Hi all,

Supporting idempotency is a great enhancement for Polaris!

Being able to see the whole architecture and code design would help a lot.
The best way to achieve this and iterate on the overall approach is to
have a (draft) PR containing at least enough to let everybody perform
tests against it and inspect the solution top-down - or even have the
whole thing in one PR and later split it into "reviewable" smaller
PRs.

Robert

On Mon, Dec 15, 2025 at 8:04 PM huaxin gao <[email protected]> wrote:
>
> Hi all,
>
> Thanks again for the detailed feedback on the idempotency design. I've gone
> through all the comments and updated the design doc accordingly. If you see
> any remaining gaps or have further questions, please let me know. If there
> are no further comments, I'll resume the implementation work based on the
> revised document.
>
> Best,
> Huaxin
>
> On Tue, Dec 9, 2025 at 11:15 PM huaxin gao <[email protected]> wrote:
>
> > Hi Robert,
> >
> >
> > Quick follow‑up to my mail from yesterday: I’ve just updated the proposal
> > text to incorporate your comments. In particular:
> >
> >
> > I clarified the finalization rules so the server MUST NOT return 2xx if
> > commit/update preconditions (expected base snapshot, requested schema
> > changes, etc.) are not satisfied; in those cases the handler returns an
> > appropriate 4xx, and that 4xx is what the idempotency layer finalizes and
> > replays.
> >
> >
> > I added an explicit replay‑failure path: if a previously finalized result
> > can no longer be reproduced (e.g., table dropped and metadata purged), the
> > server returns a 5xx with subtype idempotency_replay_failed and does not
> > try to re‑run the old mutation.
> >
> >
> > Under Multi‑node Coordination, I wrote down the stale‑lease/reconciliation
> > behavior more concretely and noted that pod restarts/crashes show up as
> > missing heartbeats; duplicates then see a stale lease, run reconciliation
> > once, and either return the original result or a 503 rather than waiting
> > indefinitely.
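> >
> >
> > For illustration, the duplicate‑side decision above as a tiny sketch (type,
> > method names and the exact lease window are placeholders, not what ends up
> > in the doc or the code):
> >
> > import java.time.Duration;
> > import java.time.Instant;
> >
> > // How a duplicate request reacts to the owner's lease, per the paragraph
> > // above. Names and the lease window are illustrative only.
> > final class DuplicatePolicySketch {
> >
> >   enum Action { REPLAY_FINALIZED, WAIT_AND_POLL, RECONCILE_ONCE }
> >
> >   static Action onDuplicate(boolean finalized, Instant heartbeatAt,
> >                             Duration leaseWindow, Instant now) {
> >     if (finalized) {
> >       return Action.REPLAY_FINALIZED;   // original result is available
> >     }
> >     boolean leaseValid =
> >         Duration.between(heartbeatAt, now).compareTo(leaseWindow) <= 0;
> >     if (leaseValid) {
> >       return Action.WAIT_AND_POLL;      // owner looks alive: short, bounded wait
> >     }
> >     // Missing heartbeats (e.g. the owner pod crashed): reconcile exactly
> >     // once, then return the original result or a 503, never block forever.
> >     return Action.RECONCILE_ONCE;
> >   }
> > }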
> >
> >
> > I added a short Failure modes note after the IdempotencyStore SPI
> > describing how we handle a pluggable backend that is down/slow:
> > coordination‑critical paths fail fast with a defined 5xx, while
> > heartbeat/finalize are best‑effort so Polaris itself doesn’t get stuck.
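> >
> >
> > To make the fail‑fast vs. best‑effort split concrete, the IdempotencyStore
> > SPI surface could look roughly like this (signatures are illustrative, not
> > the ones in the doc):
> >
> > import java.time.Instant;
> > import java.util.Optional;
> >
> > // Illustrative shape of the pluggable store; method names follow the
> > // reserve/load/heartbeat/finalize/purge operations mentioned in the thread.
> > interface IdempotencyStore {
> >
> >   // Coordination-critical: atomically reserve the key for this node.
> >   // Empty result means this node won the reservation; otherwise the
> >   // already-existing record is returned to the duplicate. If the backend
> >   // is down or slow, this path fails fast with a defined 5xx.
> >   Optional<IdempotencyRecord> reserve(String realm, String key,
> >                                       String operationType, String resourceId);
> >
> >   // Coordination-critical: duplicates load the existing record.
> >   Optional<IdempotencyRecord> load(String realm, String key);
> >
> >   // Best-effort: failures here must not block or fail Polaris itself.
> >   void heartbeat(String realm, String key, Instant now);
> >
> >   // Best-effort: record the terminal result (2xx / terminal 4xx) for replay;
> >   // called "finalize" in the doc.
> >   void finalizeResult(String realm, String key, int httpStatus,
> >                       byte[] responseBody);
> >
> >   // Housekeeping: remove expired records.
> >   void purge(Instant olderThan);
> >
> >   record IdempotencyRecord(String realm, String key, String operationType,
> >                            String resourceId, Instant heartbeatAt,
> >                            Integer httpStatus, byte[] responseBody) {}
> > }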
> >
> >
> > I also tightened the Non‑Goals section to state explicitly that this
> > iteration only targets the built‑in Polaris Iceberg REST catalog, not
> > federated or non‑IRC APIs.
> >
> >
> > Happy to tweak the wording further if you think any of these areas still
> > need more precision.
> >
> >
> > Best,
> >
> > Huaxin
> >
> >
> > On Mon, Dec 8, 2025 at 8:50 PM huaxin gao <[email protected]> wrote:
> >
> >> Hi Robert,
> >>
> >> Thanks a lot for taking the time to write such a detailed note — I really
> >> appreciate the careful review and the references back to the Iceberg spec.
> >> I agree this cuts across API, persistence, and distributed‑systems
> >> concerns, so we need to get the design right before we treat it as “done”.
> >>
> >> On a few of your main points:
> >>
> >>    - Scope & Iceberg semantics
> >>
> >> The “key‑only semantics” phrase is my shorthand, not Iceberg wording.
> >> What I meant is: in the earlier Iceberg mailing‑list discussion we
> >> converged on baseline key‑only idempotency aimed at low‑level/network
> >> retries, with no payload fingerprinting in the protocol. Servers treat
> >> Idempotency-Key as an opaque token bound to a single
> >> operation/resource/realm; if we ever explore payload‑binding, that would be
> >> a separate follow‑up discussion, not part of the current design.
> >>
> >>
> >>    - Key vs fingerprinting
> >>
> >> I agree that a bare Idempotency-Key + entity identifier is not enough to
> >> protect against buggy or malicious clients; fingerprinting the full request
> >> would be stronger. For v1 I was trying to stay aligned with the Iceberg
> >> REST spec (client‑supplied key, no payload fingerprint in the contract) and
> >> keep the server implementation simple, but I’ll add a section that:
> >>
> >>
> >>       - calls out the risk you describe (two different logical requests
> >>         reusing the same key),
> >>       - spells out the binding we do enforce (key + operation type +
> >>         resource + realm), and
> >>       - treats request‑fingerprinting as a possible follow‑on
> >>         enhancement rather than something we’re silently ignoring.
> >>    - Multi‑pod, liveness and “followers waiting forever”
> >>
> >> I’ve expanded the design doc to describe the heartbeat/lease mechanism
> >> and the behavior when the primary pod dies. In short: we don’t let
> >> followers wait unboundedly. Each owner periodically calls updateHeartbeat,
> >> and duplicates only wait while now − heartbeat_at is within a short
> >> lease window; once that lease expires, we hand control to a reconciliation
> >> step rather than continuing to block. I’ll make sure this algorithm and its
> >> failure modes (including timeouts/back‑pressure limits) are written down
> >> more rigorously, not just implied in the text.
> >>
> >>
> >>    - Backend pluggability and failure
> >>
> >> I agree that if the idempotency backend is pluggable, the design has to
> >> cover the backend being down or nuked explicitly, so Polaris doesn’t just
> >> hang. I will add a note to the design doc: if the idempotency backend is
> >> unavailable we must fail requests in a bounded way (not hang), and treat
> >> heartbeat/finalize as best-effort so Polaris doesn't get stuck.
> >>
> >>
> >>    - Quarkus collaboration and scope
> >>
> >> I like the idea of collaborating with the Quarkus community on a more
> >> generic JAX‑RS idempotency layer, and I agree there’s nothing inherently
> >> “Polaris‑only” about many of these concerns. For the moment I’d still like
> >> to keep this proposal scoped to the Polaris REST catalog (IRC) so we can
> >> converge on concrete semantics there first, but I’ll add a short “future
> >> work” section that talks about factoring out the generic pieces and
> >> exploring Quarkus integration once we have agreement on the core behavior.
> >>
> >> Best,
> >> Huaxin
> >>
> >>
> >>
> >>
> >> On Mon, Dec 8, 2025 at 2:23 AM Robert Stupp <[email protected]> wrote:
> >>
> >>> Hi,
> >>>
> >>> > Spec alignment: Iceberg chose key‑only semantics (no payload
> >>> fingerprinting)
> >>>
> >>> I do not see this "key-only semantics" mentioned anywhere in the
> >>> Iceberg spec [1].
> >>> The Iceberg spec requirement "The idempotency key must be globally
> >>> unique" [1] OTOH is impossible for a client to guarantee.
> >>>
> >>> It is "relatively easy" for clients to implement some retry mechanism
> >>> and add some HTTP header (it is probably also not easy for clients to
> >>> implement properly, see all the issues that happened in the past wrt
> >>> when to (not) throw a CommitFailedException).
> >>> It is definitely a complex task for servers.
> >>>
> >>> This feature touches many application, HTTP, security and distributed
> >>> systems aspects and we should be very careful.
> >>> I'd like to repeat my proposal to collaborate with the Quarkus
> >>> community, because they have extensive knowledge about all these
> >>> things and are very open to collaboration. I do not see any
> >>> "specialties" that are unique to Polaris and force us to come up with
> >>> our very own implementation.
> >>>
> >>> In any case, we should first design this functionality very carefully.
> >>> Consider all use cases, the potential logical and technical states,
> >>> exceptions, race conditions and failure scenarios.
> >>> After we have consensus on all that, we can move on to the code.
> >>>
> >>> Some comments around the design and the feature itself:
> >>> * The multi-table-commit endpoint doesn't seem to fit into the design
> >>> (many "resources" not one)?
> >>> * A resource has been deleted before the current state can be served
> >>> ("delete" succeeds before the "follower of an update" finishes). The
> >>> idempotent-request code would yield "serve this table-metadata" - but
> >>> it cannot, as the metadata has been purged.
> >>> * Two "idempotent requests" yielding different results, racing with
> >>> one another. While that's mentioned in the Iceberg spec as "response body
> >>> *may* reflect a newer state", I am not convinced this is what all
> >>> clients can cope with. For example, a client that adds a column to the
> >>> schema expects that column to be present upon successful request
> >>> completion. But the "response body may reflect a newer state"
> >>> exception means that an "add column" operation can legitimately yield a
> >>> schema without the added column. Similar for column type changes,
> >>> column removals, extending to sort-orders and partition-specs and lots
> >>> more. This can lead to subtle issues in all clients and query engines.
> >>> Isn't it a server bug to return a success response to an Iceberg
> >>> update-table request whose update requirements cannot be fulfilled?
> >>> * "Sole idempotency-key + entity-identifier" is not enough. Two
> >>> identical idempotency keys, either intentional or due to a bug, would
> >>> lead to wrong responses. This would _not_ be a problem with
> >>> fingerprinting _all_ inputs for the operation. All existing
> >>> implementations and experience reports/posts that I could find do
> >>> request fingerprinting, including the request body.
> >>> * It is unclear whether this feature is intended for the "built in"
> >>> Polaris Iceberg catalog or does it include federated catalogs? Is it
> >>> useful for non-IRC APIs?
> >>>
> >>> Some comments around the technical things. All these could be
> >>> "offloaded" to Quarkus, leveraging async event loop processing and
> >>> circuit breaking:
> >>> * If a pod executing the "primary" request dies, do "followers" wait
> >>> "forever"? What is the actual distributed algorithm to resolve this
> >>> without having a lot of threads spinning for a long time? This can
> >>> happen for buggy clients, bad client configurations or intentionally
> >>> bad clients.
> >>> * Technical failure and rolling-restart/upgrade scenarios should be
> >>> considered.
> >>> * As the "idempotent request coordination backend" is pluggable, the
> >>> design should also consider the case that the backend state is nuked,
> >>> becomes unresponsive or fails in the meantime. We should avoid failing
> >>> Polaris if this subsystem fails, and avoid letting this subsystem itself
> >>> become a source of the retries it exists to handle (i.e., retries due to
> >>> timeouts because the idempotent-request subsystem hangs).
> >>>
> >>> Robert
> >>>
> >>> [1]
> >>> https://github.com/apache/iceberg/blob/19b4bd024486d9d516d0e547e273419c1bc7074e/open-api/rest-catalog-open-api.yaml#L1933-L1961
> >>>
> >>> On Wed, Nov 26, 2025 at 7:08 PM huaxin gao <[email protected]>
> >>> wrote:
> >>> >
> >>> > Thanks Robert for the thoughtful note!
> >>> >
> >>> > A generic JAX‑RS/Quarkus idempotency layer would be useful broadly, and
> >>> > Quarkus’s distributed cache is a good building block. For Polaris, though,
> >>> > we need a few things that go beyond caching or generic locking:
> >>> >
> >>> >
> >>> >    - No external lock service: we use an atomic “first‑writer‑wins”
> >>> >    reserve via a unique key in durable storage (single upsert), so
> >>> >    exactly one node owns a key; others see the existing row (rough
> >>> >    sketch below this list).
> >>> >    - Spec alignment: Iceberg chose key‑only semantics (no payload
> >>> >    fingerprinting). Safety comes from first‑acceptance plus binding
> >>> >    {operationType, resourceId, realm}; mismatched reuse -> 422;
> >>> >    duplicates do not re‑execute.
> >>> >    - Liveness and failover: heartbeat/lease while IN_PROGRESS and
> >>> >    reconciliation on stale owners (finalize‑gap/takeover) so duplicates
> >>> >    don’t block indefinitely and we avoid double execution.
> >>> >    - Durable replay: persist a minimal, equivalent response (not
> >>> >    in‑memory/TTL cache) and a clear status policy (finalize only
> >>> >    2xx/terminal 4xx; never 5xx).
> >>> >
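> >>> > A rough sketch of the “first‑writer‑wins” reserve from the first bullet
> >>> > above (table and column names are made up, Postgres syntax is assumed;
> >>> > this is illustrative, not the code that will be in the PR):
> >>> >
> >>> > import java.sql.Connection;
> >>> > import java.sql.PreparedStatement;
> >>> > import java.sql.SQLException;
> >>> >
> >>> > // Single conditional insert on a unique key: exactly one node gets
> >>> > // rowCount == 1 and owns the key; every other node sees 0 and takes
> >>> > // the duplicate path.
> >>> > final class ReserveSketch {
> >>> >   static boolean tryReserve(Connection db, String realm, String key,
> >>> >                             String operationType, String resourceId)
> >>> >       throws SQLException {
> >>> >     String sql =
> >>> >         "INSERT INTO idempotency_keys"
> >>> >             + " (realm, idem_key, operation_type, resource_id, heartbeat_at)"
> >>> >             + " VALUES (?, ?, ?, ?, now())"
> >>> >             + " ON CONFLICT (realm, idem_key) DO NOTHING";
> >>> >     try (PreparedStatement ps = db.prepareStatement(sql)) {
> >>> >       ps.setString(1, realm);
> >>> >       ps.setString(2, key);
> >>> >       ps.setString(3, operationType);
> >>> >       ps.setString(4, resourceId);
> >>> >       return ps.executeUpdate() == 1;  // true: this node owns the key
> >>> >     }
> >>> >   }
> >>> > }
> >>> >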
> >>> > Phase I: I’ll focus on implementing this in Polaris behind a small
> >>> > storage‑agnostic SPI and wiring the flows.
> >>> >
> >>> > Phase II: we can revisit extracting the core into a reusable
> >>> > JAX‑RS/Quarkus module, but for now I’d like to keep the scope on
> >>> > shipping Polaris v1.
> >>> > Thanks,
> >>> > Huaxin
> >>> >
> >>> > On Tue, Nov 25, 2025 at 11:18 PM Robert Stupp <[email protected]> wrote:
> >>> >
> >>> > > Hi all,
> >>> > >
> >>> > > To build an idempotent service, it seems necessary to consider some
> >>> > > things, naming a few:
> >>> > > * distributed locking, resilient to failure scenarios
> >>> > > * distributed caching
> >>> > > * request fingerprinting
> >>> > > * request failure scenarios
> >>> > >
> >>> > > I think a generic JAX-RS idempotency functionality would be
> >>> beneficial
> >>> > > not just for Polaris.
> >>> > > I can imagine that the Quarkus project would be very interested in
> >>> > > such a thing. For example, Quarkus already has functionality for
> >>> > > distributed caching in place, which is a building block for
> >>> idempotent
> >>> > > responses.
> >>> > > Have we considered joining forces with them and leveraging synergies?
> >>> > >
> >>> > > Robert
> >>> > >
> >>> > > On Wed, Nov 26, 2025 at 4:57 AM huaxin gao <[email protected]>
> >>> wrote:
> >>> > > >
> >>> > > > Hi Dmitri,
> >>> > > >
> >>> > > > Thanks for the reply and the detailed comments in the proposal. You’re
> >>> > > > right: the goal is to implement the recently approved Iceberg
> >>> > > > Idempotency-Key spec, and we don’t plan any additional REST Catalog API
> >>> > > > changes in Polaris. I’ve refocused the proposal on the server-side
> >>> > > > implementation and agree we should land the REST Catalog work first,
> >>> > > > then extend to the Management API.
> >>> > > >
> >>> > > > I addressed your inline comments and added a small, backend-agnostic
> >>> > > > Idempotency Persistence API (reserve/load/heartbeat/finalize/purge)
> >>> > > > so it works across all storage backends (Postgres first).
> >>> > > >
> >>> > > > On the async tasks framework: agreed — there are synergies. I’ll keep
> >>> > > > this in mind and align the idempotency store semantics with the async
> >>> > > > tasks model.
> >>> > > > Best,
> >>> > > > Huaxin
> >>> > > >
> >>> > > > On Tue, Nov 25, 2025 at 12:21 PM Dmitri Bourlatchkov <
> >>> [email protected]>
> >>> > > > wrote:
> >>> > > >
> >>> > > > > Hi Huaxin,
> >>> > > > >
> >>> > > > > Thanks for resuming this proposal!
> >>> > > > >
> >>> > > > > In general, I suppose the intention is to implement the recently
> >>> > > approved
> >>> > > > > Iceberg REST Catalog spec change for Idempotency Keys. With that
> >>> in
> >>> > > mind, I
> >>> > > > > believe the Polaris proposal probably needs to be more focused
> >>> on the
> >>> > > > > server side implementation now that the API spec has been
> >>> finalized. I
> >>> > > do
> >>> > > > > not think Polaris needs any other API changes in the REST
> >>> Catalog on
> >>> > > top of
> >>> > > > > the Iceberg spec.
> >>> > > > >
> >>> > > > > I'd propose to deal with the REST Catalog API first and then
> >>> extend to
> >>> > > the
> >>> > > > > Management API (for the sake of simplicity).
> >>> > > > >
> >>> > > > > I added some more specific comments in the doc, but overall, I
> >>> believe
> >>> > > we
> >>> > > > > need to consider what needs to be changed in the java
> >>> Persistence API
> >>> > > in
> >>> > > > > Polaris because the idempotency feature probably applies to all
> >>> > > backends.
> >>> > > > >
> >>> > > > > Also, as I commented [1] in earlier emails about this proposal, I
> >>> > > believe
> >>> > > > > some synergies can be found with the async tasks framework [2].
> >>> The
> >>> > > main
> >>> > > > > point here is orchestrating request execution among a set of
> >>> > > distributed
> >>> > > > > server nodes.
> >>> > > > >
> >>> > > > > [1]
> >>> https://lists.apache.org/thread/28hx9kl4qmm5sho8jxmjlt6t0cd0hn6d
> >>> > > > >
> >>> > > > > [2]
> >>> https://lists.apache.org/thread/gg0kn89vmblmjgllxn7jkn8ky2k28f5l
> >>> > > > >
> >>> > > > > Cheers,
> >>> > > > > Dmitri.
> >>> > > > >
> >>> > > > >
> >>> > > > > On Sat, Nov 22, 2025 at 7:50 PM huaxin gao <
> >>> [email protected]>
> >>> > > wrote:
> >>> > > > >
> >>> > > > > > Hi all,
> >>> > > > > > I would like to restart the discussion on Idempotency-Key
> >>> support in
> >>> > > > > > Polaris. This proposal focuses on Polaris server-side behavior
> >>> and
> >>> > > > > > implementation details, with the Iceberg spec as the baseline
> >>> API
> >>> > > > > contract.
> >>> > > > > > Thanks for your review and feedback.
> >>> > > > > >
> >>> > > > > > Polaris Idempotency Key Proposal
> >>> > > > > > <
> >>> > > > > >
> >>> > > > >
> >>> > >
> >>> https://docs.google.com/document/d/1ToMMziFIa7DNJ6CxR5RSEg1dgJSS1zFzZfbngDz-EeU/edit?tab=t.0#heading=h.ecn4cggb6uy7
> >>> > > > > > >
> >>> > > > > >
> >>> > > > > > Iceberg Idempotency Key Proposal
> >>> > > > > > <
> >>> > > > > >
> >>> > > > >
> >>> > >
> >>> https://docs.google.com/document/d/1WyiIk08JRe8AjWh63txIP4i2xcIUHYQWFrF_1CCS3uw/edit?tab=t.0#heading=h.jfecktgonj1i
> >>> > > > > > >
> >>> > > > > >
> >>> > > > > > Best,
> >>> > > > > > Huaxin
> >>> > > > > >
> >>> > > > >
> >>> > >
> >>>
> >>
