I am fairly strongly opposed to repurposing the timestamp field for this. To move forward, I'd recommend working on catalog sequence numbers.
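As a rough sketch of what client-side snapshot resolution against catalog sequence numbers could look like: the `Snapshot` record, its `csn` field, and the `resolve_at` and `consistent_view` helpers below are hypothetical names used only for illustration. They are not part of either proposal or of the Iceberg spec, and the sketch assumes each table's log is ordered by strictly increasing CSN.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    csn: int  # hypothetical catalog-assigned commit sequence number


def resolve_at(log: List[Snapshot], reference_csn: int) -> Optional[Snapshot]:
    """Return the latest snapshot with csn <= reference_csn, or None.

    Assumes `log` is ordered by strictly increasing csn.
    """
    idx = bisect_right([s.csn for s in log], reference_csn)
    return log[idx - 1] if idx else None


def consistent_view(
    tables: Dict[str, List[Snapshot]], reference_csn: int
) -> Dict[str, Optional[Snapshot]]:
    """Pin every table to the same reference CSN for a repeatable read."""
    return {name: resolve_at(log, reference_csn) for name, log in tables.items()}


# Illustrative data only: two tables committed through the same catalog.
tables = {
    "orders": [Snapshot(101, 5), Snapshot(102, 9), Snapshot(103, 14)],
    "payments": [Snapshot(201, 7), Snapshot(202, 12)],
}
view = consistent_view(tables, reference_csn=10)
```

The point of the sketch is that the catalog only has to hand out a monotonically increasing number; picking the right snapshot per table stays on the client, and table loading itself does not change.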
On Sat, Nov 8, 2025 at 6:54 PM Dov Alperin <[email protected]> wrote:

> Hi Iceberg community!
> (I initially opened this message as its own thread in error, sorry about that.)
> I'm curious where this proposal landed. I work at Materialize <http://materialize.com/>, and we are keenly interested both in seeing this proposal come to fruition and possibly in helping to implement it.
>
> I see there was a call in May, but I'm not sure what the conclusion was. As spec v4 draws nearer, I am curious which of the two proposals the community favors here.
>
> Best,
> Dov
>
> On Tue, May 27, 2025 at 01:09:05AM -0700, Maninderjit Singh wrote:
> > Forgot to attach a link to the updated proposal
> > <https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?pli=1&tab=t.0#heading=h.ypbwvr181qn4>.
> >
> > On Tue, May 27, 2025 at 1:06 AM Maninderjit Singh <[email protected]> wrote:
> >
> > > Hi community,
> > >
> > > I have updated the proposal with both options (overwriting the existing timestamp-ms vs. introducing a new sequence/timestamp field), as we have initial consensus on using a catalog-authored sequence/timestamp. Jagdeep, please review to ensure that the options are correctly captured. I have also added arguments for why we can't assume the timestamp is "informational": it is used in critical paths, and incorrect values can take the table offline.
> > >
> > > Also, I'm moving the meeting to Thursday to better accommodate conflicts. I will also record the meeting in case anyone misses it and is interested in the discussion.
> > >
> > > Sync for iceberg multi-table transactions
> > > Thursday, May 29 · 9:00 – 10:00am
> > > Time zone: America/Los_Angeles
> > > Google Meet joining info
> > > Video call link: https://meet.google.com/ffc-ttjs-vti
> > >
> > > Thanks,
> > > Maninder
> > >
> > > On Mon, May 26, 2025 at 12:47 AM Péter Váry <[email protected]> wrote:
> > >
> > >> I'm interested but can't be there, so please record the meeting.
> > >> Thanks,
> > >> Peter
> > >>
> > >> Maninderjit Singh <[email protected]> wrote (on Sat, May 24, 2025, 2:30):
> > >>
> > >>> Hi dev community,
> > >>> I was wondering if we could join a call next week to discuss multi-table transactions so we can make progress. I have shared a meeting invite that anyone interested in the discussion can join. Please let me know if this works.
> > >>>
> > >>> Thanks,
> > >>> Maninder
> > >>>
> > >>> Sync for iceberg multi-table transactions
> > >>> Friday, May 30 · 9:00 – 10:00am
> > >>> Time zone: America/Los_Angeles
> > >>> Google Meet joining info
> > >>> Video call link: https://meet.google.com/ffc-ttjs-vti
> > >>>
> > >>> On Wed, May 21, 2025 at 10:25 AM Maninderjit Singh <[email protected]> wrote:
> > >>>
> > >>>> Hi dev community,
> > >>>> Following up on the thread here to continue the discussion and get feedback, since we couldn't get to it in the sync. I think we have made some progress in the discussion that I want to capture, while highlighting the items where we still need consensus, along with their pros and cons. I would appreciate help adding clarity and making sure the arguments are captured correctly.
> > >>>>
> > >>>> *Things we agree on*
> > >>>>
> > >>>> 1. Don't maintain server-side state for tracking the transactions.
> > >>>> 2. Need global (catalog-wide) ordering of snapshots via some (hybrid/logical) clock/CSN
> > >>>> 3.
Optionally expose the catalog's clock/CSN information without changing how tables load
> > >>>> 4. Loading a consistent snapshot across multiple tables and repeatable reads based on the reference clock/CSN
> > >>>>
> > >>>> *Things we disagree on*
> > >>>>
> > >>>> 1. Reuse the existing timestamp field vs. introduce a new CSN field
> > >>>>
> > >>>> *Reusing timestamp field approach*
> > >>>>
> > >>>> - Pros:
> > >>>>
> > >>>> 1. Backwards compatibility: no change to the table metadata spec, so it could be used by existing v2 tables.
> > >>>> 2. Fixes existing time travel and ordering issues
> > >>>> 3. Simplifies and clarifies the spec (no new id for snapshots)
> > >>>> 4. Common notion of timestamp that could be used to evaluate causal relationships in other proposals like events or commit reports.
> > >>>>
> > >>>> - Cons:
> > >>>>
> > >>>> 1. Unique timestamp generation in milliseconds. Potential mitigations:
> > >>>> https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?pli=1&disco=AAABjwaxXeg
> > >>>> 2. Concerns about the client-side timestamp being overridden.
> > >>>>
> > >>>> *Adding new CSN field*
> > >>>>
> > >>>> - Pros:
> > >>>>
> > >>>> 1. Flexibility to use logical or hybrid clocks. Not sure how clients can generate a hybrid clock timestamp here without suffering from clock skew (it would be good to clarify this).
> > >>>> 2. No client-side overriding concerns.
> > >>>>
> > >>>> - Cons:
> > >>>>
> > >>>> 1. Not backwards compatible: requires a new field in table metadata, so it needs to wait for v4
> > >>>> 2. Does not fix time travel and snapshot-log ordering issues
> > >>>> 3. Adds another id for snapshots that clients need to generate and reason about.
> > >>>> 4. Could not be extended for use in other proposals for causal reasoning.
> > >>>>
> > >>>> Thanks,
> > >>>> Maninder
> > >>>>
> > >>>> On Tue, May 20, 2025 at 8:16 PM Maninderjit Singh <[email protected]> wrote:
> > >>>>
> > >>>>> Appreciate the feedback on the "catalog-authored timestamp" document
> > >>>>> <https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?pli=1&tab=t.0>!
> > >>>>>
> > >>>>> Ryan, I don't think we can get consistent time travel queries in Iceberg without fixing the timestamp field, since it's what the spec <https://iceberg.apache.org/spec/#point-in-time-reads-time-travel> prescribes for time travel. Hence I took the liberty of reusing it for the catalog timestamp, which ensures that the snapshot-log is correctly ordered for time travel. Additionally, the timestamp field needs to be fixed to avoid breaking commits to the table due to accidental large skews under the current spec; the scenario is described in detail here
> > >>>>> <https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?pli=1&tab=t.0#bookmark=id.6avx66vzo168>.
> > >>>>> The other benefit of reusing the timestamp field is spec simplicity and clarity on timestamp generation responsibilities, without the need to manage yet another identifier (in addition to sequence number, snapshot id and timestamp) for snapshots.
> > >>>>>
> > >>>>> Jagdeep, your concerns about overriding the timestamp field are valid, but the reason I'm not too worried is that a client can't assume a commit succeeded until the catalog acknowledges the request by returning the CommitTableResponse
> > >>>>> <https://github.com/apache/iceberg/blob/c2478968e65368c61799d8ca4b89506a61ca3e7c/open-api/rest-catalog-open-api.yaml#L3997>
> > >>>>> with new metadata (which has catalog-authored timestamps in the proposal). I'm happy to work with you to put something common together and get the best out of the proposals.
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Maninder
> > >>>>>
> > >>>>> On Tue, May 20, 2025 at 5:48 PM Jagdeep Sidhu <[email protected]> wrote:
> > >>>>>
> > >>>>>> Thank you Ryan, Maninder and the rest of the community for feedback and ideas!
> > >>>>>> Drew and I will take another pass and remove the catalog coordination requirement for the LoadTable API, and bring the proposal closer to "catalog-authored timestamp" in the sense that clients can use the CSN to find the right snapshot, while still leaving it up to the Catalog what it wants to use for the CSN (a hybrid clock timestamp or another monotonically increasing number).
> > >>>>>>
> > >>>>>> If more folks have feedback, please leave it in the doc or on the email list, so we can address it as well in the document update.
> > >>>>>>
> > >>>>>> Maninder, one reason we proposed a new field for CommitSequenceNumber instead of using an existing field is backwards compatibility. Catalogs can start optionally exposing the new field, and interested clients can use the new field, but existing clients keep working as is.
Existing and new clients can also keep working as is against the same tables in the same Catalog. My one worry is that having the Catalog override the timestamp field for commits may break some existing clients. Today, Iceberg engines/clients do not expect the timestamp field in metadata/snapshot-log to be overwritten by the Catalog.
> > >>>>>>
> > >>>>>> How do you feel about taking the best from each proposal? That is, monotonically increasing commit sequence numbers (some catalogs can use timestamps, some can use a logical clock, but we don't have to enforce it; leave it up to the Catalog), while keeping the client-side logic for resolving the right snapshot using sequence numbers instead of adding that functionality to the Catalog. Let me know!
> > >>>>>>
> > >>>>>> Thank you!
> > >>>>>> -Jagdeep
> > >>>>>>
> > >>>>>> On Tue, May 20, 2025 at 2:45 PM Ryan Blue <[email protected]> wrote:
> > >>>>>>
> > >>>>>>> Thanks for the proposals! There are things that I think are good about both of them. I think that the catalog-authored timestamps proposal misunderstands the purpose of the timestamp field, but does get right that a monotonically increasing "time" field (really a sequence number) across tables enables the coordination needed for snapshot-isolated reads. I like that the sequence number proposal leaves the meaning of the field to the catalog for coordination, but it still proposes catalog coordination by loading tables "at" some sequence number. Ideally, we would be able to (optionally) expose this extra catalog information to clients and not need to change how loading works.
> > >>>>>>>
> > >>>>>>> Ryan
> > >>>>>>>
> > >>>>>>> On Tue, May 20, 2025 at 9:45 AM Ryan Blue <[email protected]> wrote:
> > >>>>>>>
> > >>>>>>>> Hi everyone,
> > >>>>>>>>
> > >>>>>>>> To avoid passing copies of a file around for comments, I put the doc for commit sequence numbers into Google so we can comment on a central copy:
> > >>>>>>>> https://docs.google.com/document/d/1jr4Ah8oceOmo6fwxG_0II4vKDUHUKScb/edit?usp=sharing&ouid=100239850723655533404&rtpof=true&sd=true
> > >>>>>>>>
> > >>>>>>>> Ryan
> > >>>>>>>>
> > >>>>>>>> On Fri, May 16, 2025 at 2:51 AM Maninderjit Singh <[email protected]> wrote:
> > >>>>>>>>
> > >>>>>>>>> Thanks for the updated proposal Drew!
> > >>>>>>>>> My preference for using the catalog authored timestamp is to minimize changes to the REST spec so we can have good backwards compatibility. I have quickly put together a draft proposal on how this should work. Looking forward to feedback and discussion.
> > >>>>>>>>>
> > >>>>>>>>> Draft Proposal: Catalog‑Authored Timestamps for Apache Iceberg REST Catalog
> > >>>>>>>>> <https://drive.google.com/open?id=1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE>
> > >>>>>>>>>
> > >>>>>>>>> Thanks,
> > >>>>>>>>> Maninder
> > >>>>>>>>>
> > >>>>>>>>> On Wed, May 14, 2025 at 6:12 PM Drew <[email protected]> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Hi everyone,
> > >>>>>>>>>>
> > >>>>>>>>>> Thank you for feedback on the MTT proposal and during community sync. Based on it, Jagdeep and I have iterated on the document and added a second option to use *Catalog CommitSequenceNumbers*. Looking forward to getting more feedback on the proposal, where to add more details or approach/changes to consider. We appreciate everyone's time on this!
> > >>>>>>>>>>
> > >>>>>>>>>> The option introduces *Catalog CommitSequenceNumbers (CSNs)*, which allow clients/engines to read a consistent view of multiple tables without registering a transaction context with the catalog. That in turn removes the need for transaction bookkeeping on the catalog side. To abort transactions early, clients can use LoadTable with and without a CSN to figure out whether there is already a conflicting write on any of the tables being modified. We also removed the section where transactions were staging commits on the Catalog, and changed the proposal to align with Eduard's PR around staging changes locally before commit (https://github.com/apache/iceberg/pull/6948).
> > >>>>>>>>>>
> > >>>>>>>>>> Jagdeep also gave an example in a previous email of a workload that may require multi-table snapshot isolation even if the tables are updated without the multi-table commit API, though most MTT transactions will commit using the multi-table commit API.
> > >>>>>>>>>>
> > >>>>>>>>>> Maninder, regarding the approach of a "common notion of time between clients and catalog": I spent some time thinking about it, but cannot find a feasible way to do this. Yes, catalogs can use a high-precision clock, but clients cannot use the Catalog timestamp from API calls to set their local clock because of network latency on the request/response. For example, different requests to the same Catalog servers can return different timestamps depending on network latency. Also, what if a client works with more than one Catalog?
> > >>>>>>>>>> If you want to do a rough write-up or share a reference implementation that uses such an approach, I will be happy to brainstorm it more. Let us know!
> > >>>>>>>>>>
> > >>>>>>>>>> Here is the link to the updated proposal:
> > >>>>>>>>>> <https://docs.google.com/document/d/1jr4Ah8oceOmo6fwxG_0II4vKDUHUKScb/edit?usp=sharing&ouid=100384647237395649950&rtpof=true&sd=true>
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks again!
> > >>>>>>>>>> - Drew
