Hi Huaxin,

Thanks a lot for the proposal!
Having idempotency in Polaris would be really great! I wanted to comment on the doc, but commenting doesn't seem possible, so I'm asking here instead.

IIUC, idempotency is about giving a client a chance to resend the (exact) same "committing" request to Polaris after an intermittent technical failure and get the exact same result as for the original request. However, I couldn't find anything in the proposal about replaying the HTTP response, and the example table schema only seems to contain the HTTP status code. Could you add some details about how the response for a repeated request would be handled?

I stumbled upon the phrase "Keys (UUIDv7) are globally unique". That's not quite correct. A UUIDv7 contains a 48-bit timestamp; of the remaining bits, implementations may use 12 to 42 as a randomly seeded counter, and the rest is filled with randomness. Assuming a proper implementation, that gives a pretty good value, but UUIDv7 does not guarantee any uniqueness; it merely recommends good randomness, and that's not the same as "globally unique". Bad PRNGs, buggy PRNGs, bad or buggy UUIDv7 generators, buggy clients reusing the idempotency key of a previous, different request, and, last but not least, malicious actors can all provoke collisions. I just want to warn that assuming UUIDv7 plus operation/resource is enough to distinguish all requests can be a trap. Building a service-internal key from the client-provided idempotency key plus more data from the request gives better uniqueness. That "more data" can come from: the operation id, all operation parameters including the whole payload, quite a few HTTP request headers (user-agent, authorization, host, accept-*), and also the client's address.

The doc mentions that the idempotency code would be implemented as an HTTP filter/interceptor. I wonder whether all the information needed for authorization checks is available at that point.

We can run Polaris in k8s backed by more than one pod.
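To make the "client key plus more request data" idea concrete, here is a minimal sketch (Python for brevity; all names and the exact choice of inputs are illustrative assumptions, not part of the proposal) of deriving a service-internal key by hashing the client-supplied idempotency key together with request attributes:

```python
import hashlib


def service_internal_key(idempotency_key, method, path, body, headers, client_addr):
    """Derive a service-internal idempotency key (hypothetical helper).

    Hashes the client-provided key together with request data, so two
    different requests that accidentally or maliciously reuse the same
    UUIDv7 key still map to different internal keys.
    """
    h = hashlib.sha256()
    parts = (
        idempotency_key,
        method,
        path,
        body,  # the whole payload
        headers.get("authorization", ""),
        headers.get("user-agent", ""),
        headers.get("host", ""),
        client_addr,
    )
    for part in parts:
        h.update(part.encode("utf-8"))
        h.update(b"\x1f")  # unit separator, so concatenations stay unambiguous
    return h.hexdigest()
```

With this, a reused client key on a different payload yields a different internal key, while a true retry (identical request) yields the same one, which is what the dedup lookup needs.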
In that multi-pod setup, what are your thoughts about the case where the 1st request reaches "pod 1" and a 2nd request reaches "pod 2"? And what would happen if "pod 1" dies?

The doc could also be more precise about which HTTP status codes can be cached and which must never be cached. For example, instead of "5xx", it should list the exact codes, since not all 5xx codes can be safely retried at an arbitrary pace (rate-limiting responses). I guess we have to go through each status code individually.

What kind of telemetry (traces, metrics) do we want for our users, and at which level of detail?

I also wonder whether there are existing alternatives. Are there maybe existing middleware components (OSS or SaaS) that already offer what we need?

Best,
Robert

On Sun, Nov 23, 2025 at 1:50 AM huaxin gao <[email protected]> wrote:
>
> Hi all,
> I would like to restart the discussion on Idempotency-Key support in
> Polaris. This proposal focuses on Polaris server-side behavior and
> implementation details, with the Iceberg spec as the baseline API contract.
> Thanks for your review and feedback.
>
> Polaris Idempotency Key Proposal
> <https://docs.google.com/document/d/1ToMMziFIa7DNJ6CxR5RSEg1dgJSS1zFzZfbngDz-EeU/edit?tab=t.0#heading=h.ecn4cggb6uy7>
>
> Iceberg Idempotency Key Proposal
> <https://docs.google.com/document/d/1WyiIk08JRe8AjWh63txIP4i2xcIUHYQWFrF_1CCS3uw/edit?tab=t.0#heading=h.jfecktgonj1i>
>
> Best,
> Huaxin
