I am fairly strongly opposed to repurposing the timestamp field for this.
To move forward, I'd recommend working on catalog sequence numbers.
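
A minimal sketch of the client-side resolution discussed in the quoted thread below, where a client picks, for each table, the latest snapshot committed at or before a reference catalog sequence number. The names `SnapshotEntry`, `resolve_snapshot`, and `consistent_view`, and the idea of a per-table log sorted by CSN, are assumptions made for illustration; they are not part of the Iceberg spec or of either proposal.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class SnapshotEntry:
    """Hypothetical log entry pairing a snapshot with its catalog sequence number."""
    csn: int
    snapshot_id: int


def resolve_snapshot(log: List[SnapshotEntry], reference_csn: int) -> Optional[int]:
    """Return the id of the latest snapshot with csn <= reference_csn.

    Assumes `log` is sorted by ascending csn. Returns None when the table
    has no snapshot at or before the reference.
    """
    csns = [entry.csn for entry in log]
    idx = bisect_right(csns, reference_csn)
    return log[idx - 1].snapshot_id if idx else None


def consistent_view(
    logs: Dict[str, List[SnapshotEntry]], reference_csn: int
) -> Dict[str, Optional[int]]:
    """Resolve every table against the same reference CSN."""
    return {table: resolve_snapshot(log, reference_csn) for table, log in logs.items()}


logs = {
    "orders": [SnapshotEntry(3, 101), SnapshotEntry(7, 102), SnapshotEntry(12, 103)],
    "customers": [SnapshotEntry(5, 201), SnapshotEntry(9, 202)],
}
view = consistent_view(logs, reference_csn=8)
```

Reusing the same reference CSN for later reads gives the same per-table snapshots, which is the repeatable-read property listed under "Things we agree on" in the thread.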

On Sat, Nov 8, 2025 at 6:54 PM Dov Alperin
<[email protected]> wrote:

> Hi Iceberg community!
> (I initially opened this message as its own thread in error, sorry about
> that)
> I’m curious where this proposal landed. I work at Materialize
> <http://materialize.com/> and we are keenly interested both in seeing this
> proposal come to fruition and possibly in helping to implement it.
>
> I see there was a call in May, but I’m not sure what the conclusion was. As
> spec v4 draws closer, I am curious which of the two proposals the community
> favors here?
>
> Best,
> Dov
>
> On Tue, May 27, 2025 at 01:09:05AM -0700, Maninderjit Singh wrote:
> > Forgot to attach a link to the updated proposal
> > <https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?pli=1&tab=t.0#heading=h.ypbwvr181qn4>.
> >
> > On Tue, May 27, 2025 at 1:06 AM Maninderjit Singh <
> > [email protected]> wrote:
> >
> > > Hi community,
> > >
> > > I have updated the proposal with both options (overwriting the existing
> > > timestamp-ms vs introducing a new sequence/timestamp field), as we have
> > > initial consensus on using a catalog-authored sequence/timestamp. Jagdeep,
> > > please review to ensure that the options are correctly captured. I have
> > > also added arguments for why we can't assume the timestamp to be
> > > "informational": it's used in critical paths, and incorrect values can
> > > take the table offline.
> > >
> > > Also, I'm moving the meeting to Thursday to better accommodate conflicts.
> > > I will also record the meeting in case anyone misses it and is interested
> > > in the discussion.
> > >
> > > Sync for iceberg multi-table transactions
> > > Thursday, May 29 · 9:00 – 10:00am
> > > Time zone: America/Los_Angeles
> > > Google Meet joining info
> > > Video call link: https://meet.google.com/ffc-ttjs-vti
> > >
> > > Thanks,
> > > Maninder
> > >
> > >
> > >
> > > On Mon, May 26, 2025 at 12:47 AM Péter Váry <[email protected]>
> > > wrote:
> > >
> > >> I'm interested but can't be there, so please record the meeting.
> > >> Thanks,
> > >> Peter
> > >>
> > >> Maninderjit Singh <[email protected]> wrote (on Sat, May 24,
> > >> 2025, 2:30):
> > >>
> > >>> Hi dev community,
> > >>> I was wondering if we could join a call next week to discuss
> > >>> multi-table transactions so we can make progress. I have shared a
> > >>> meeting invite that anyone interested in the discussion can join.
> > >>> Please let me know if this works.
> > >>>
> > >>> Thanks,
> > >>> Maninder
> > >>>
> > >>> Sync for iceberg multi-table transactions
> > >>> Friday, May 30 · 9:00 – 10:00am
> > >>> Time zone: America/Los_Angeles
> > >>> Google Meet joining info
> > >>> Video call link: https://meet.google.com/ffc-ttjs-vti
> > >>>
> > >>>
> > >>> On Wed, May 21, 2025 at 10:25 AM Maninderjit Singh <
> > >>> [email protected]> wrote:
> > >>>
> > >>>> Hi dev community,
> > >>>> Following up on the thread here to continue the discussion and get
> > >>>> feedback, since we couldn't get to it in the sync. I think we have made
> > >>>> some progress in the discussion that I want to capture, while
> > >>>> highlighting the items where we still need to build consensus, along
> > >>>> with their pros and cons. I would appreciate help adding clarity and
> > >>>> making sure the arguments are captured correctly.
> > >>>>
> > >>>> *Things we agree on*
> > >>>>
> > >>>>    1. Don't maintain server-side state for tracking transactions.
> > >>>>    2. Need a global (catalog-wide) ordering of snapshots via some
> > >>>>    (hybrid/logical) clock/CSN.
> > >>>>    3. Optionally expose the catalog's clock/CSN information without
> > >>>>    changing how tables load.
> > >>>>    4. Load a consistent snapshot across multiple tables, with
> > >>>>    repeatable reads based on the reference clock/CSN.
> > >>>>
> > >>>>
> > >>>> *Things we disagree on*
> > >>>>
> > >>>>    1. Reuse existing timestamp field vs introduce a new field CSN
> > >>>>
> > >>>>
> > >>>> *Reusing timestamp field approach*
> > >>>>
> > >>>>    - Pros:
> > >>>>
> > >>>>
> > >>>>    1. Backwards compatibility, no change to table metadata spec so
> > >>>>    could be used by existing v2 tables.
> > >>>>    2. Fixes existing time travel and ordering issues
> > >>>>    3. Simplifies and clarifies the spec (no new id for snapshots)
> > >>>>    4. Common notion of timestamp that could be used to evaluate
> > >>>>    causal relationships in other proposals like events or commit
> > >>>>    reports.
> > >>>>
> > >>>>
> > >>>>    - Cons:
> > >>>>
> > >>>>
> > >>>>    1. Unique timestamp generation in milliseconds. Potential
> > >>>>    mitigations:
> > >>>>    https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?pli=1&disco=AAABjwaxXeg
> > >>>>    2. Concerns about the client-side timestamp being overridden.
> > >>>>
> > >>>> *Adding new CSN field*
> > >>>>
> > >>>>    - Pros:
> > >>>>
> > >>>>
> > >>>>    1. Flexibility to use logical or hybrid clocks. It is not clear how
> > >>>>    clients can generate a hybrid clock timestamp here without
> > >>>>    suffering from clock skew (it would be good to clarify this).
> > >>>>    2. No client-side overriding concerns.
> > >>>>
> > >>>>
> > >>>>    - Cons:
> > >>>>
> > >>>>
> > >>>>    1. Not backwards compatible: requires a new field in table
> > >>>>    metadata, so it needs to wait for v4.
> > >>>>    2. Does not fix time travel and snapshot-log ordering issues.
> > >>>>    3. Adds another id for snapshots that clients need to generate and
> > >>>>    reason about.
> > >>>>    4. Cannot be extended for use in other proposals for causal
> > >>>>    reasoning.
> > >>>>
> > >>>>
> > >>>> Thanks,
> > >>>> Maninder
> > >>>>
> > >>>> On Tue, May 20, 2025 at 8:16 PM Maninderjit Singh <
> > >>>> [email protected]> wrote:
> > >>>>
> > >>>>> Appreciate the feedback on the "catalog-authored timestamp" document
> > >>>>> <https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?pli=1&tab=t.0>!
> > >>>>>
> > >>>>> Ryan, I don't think we can get consistent time travel queries in
> > >>>>> Iceberg without fixing the timestamp field, since it's what the spec
> > >>>>> <https://iceberg.apache.org/spec/#point-in-time-reads-time-travel>
> > >>>>> prescribes for time travel. Hence I took the liberty of reusing it for
> > >>>>> the catalog timestamp, which ensures that the snapshot-log is correctly
> > >>>>> ordered for time travel. Additionally, the timestamp field needs to be
> > >>>>> fixed to avoid breaking commits to the table due to accidental large
> > >>>>> skews under the current spec; the scenario is described in detail here
> > >>>>> <https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?pli=1&tab=t.0#bookmark=id.6avx66vzo168>.
> > >>>>> The other benefit of reusing the timestamp field is spec simplicity
> > >>>>> and clarity on timestamp generation responsibilities, without needing
> > >>>>> to manage yet another identifier (in addition to sequence number,
> > >>>>> snapshot id and timestamp) for snapshots.
> > >>>>>
> > >>>>> Jagdeep, your concerns about overriding the timestamp field are valid,
> > >>>>> but the reason I'm not too worried is that a client can't assume a
> > >>>>> commit is successful until the catalog acknowledges it by returning
> > >>>>> the CommitTableResponse
> > >>>>> <https://github.com/apache/iceberg/blob/c2478968e65368c61799d8ca4b89506a61ca3e7c/open-api/rest-catalog-open-api.yaml#L3997>
> > >>>>> with new metadata (which has catalog-authored timestamps in the
> > >>>>> proposal). I'm happy to work with you to put something common together
> > >>>>> and get the best out of the proposals.
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Maninder
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> On Tue, May 20, 2025 at 5:48 PM Jagdeep Sidhu <[email protected]>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> Thank you Ryan, Maninder and the rest of the community for the
> > >>>>>> feedback and ideas!
> > >>>>>> Drew and I will take another pass and remove the catalog
> > >>>>>> coordination requirement for the LoadTable API, and bring the proposal
> > >>>>>> closer to "catalog-authored timestamp" in the sense that clients can
> > >>>>>> use the CSN to find the right snapshot, but still leave it up to the
> > >>>>>> Catalog what it wants to use for the CSN (a hybrid clock timestamp or
> > >>>>>> another monotonically increasing number).
> > >>>>>>
> > >>>>>> If more folks have feedback, please leave it in the doc or email
> > >>>>>> list, so we can address it as well in the document update.
> > >>>>>>
> > >>>>>> Maninder, one reason we proposed a new field for CommitSequenceNumber
> > >>>>>> instead of using an existing field is backwards compatibility.
> > >>>>>> Catalogs can start optionally exposing the new field, and interested
> > >>>>>> clients can use it, while existing clients keep working as is.
> > >>>>>> Existing and new clients can also keep working as is against the same
> > >>>>>> tables in the same Catalog. My one worry is that having the Catalog
> > >>>>>> override the timestamp field for commits may break some existing
> > >>>>>> clients. Today, no Iceberg engine/client expects the timestamp field
> > >>>>>> in metadata/snapshot-log to be overwritten by the Catalog.
> > >>>>>>
> > >>>>>> How do you feel about taking the best from each proposal? That is,
> > >>>>>> use monotonically increasing commit sequence numbers (some catalogs
> > >>>>>> can use timestamps, some can use a logical clock, but we don't have
> > >>>>>> to enforce it; leave it up to the Catalog), but keep the client-side
> > >>>>>> logic for resolving the right snapshot using sequence numbers instead
> > >>>>>> of adding that functionality to the Catalog. Let me know!
> > >>>>>>
> > >>>>>> Thank you!
> > >>>>>> -Jagdeep
> > >>>>>>
> > >>>>>> On Tue, May 20, 2025 at 2:45 PM Ryan Blue <[email protected]> wrote:
> > >>>>>>
> > >>>>>>> Thanks for the proposals! There are things that I think are good
> > >>>>>>> about both of them. I think that the catalog-authored timestamps
> > >>>>>>> proposal misunderstands the purpose of the timestamp field, but does
> > >>>>>>> get right that a monotonically increasing "time" field (really a
> > >>>>>>> sequence number) across tables enables the coordination needed for
> > >>>>>>> snapshot isolated reads. I like that the sequence number proposal
> > >>>>>>> leaves the meaning of the field to the catalog for coordination, but
> > >>>>>>> it still proposes catalog coordination by loading tables "at" some
> > >>>>>>> sequence number. Ideally, we would be able to (optionally) expose this
> > >>>>>>> extra catalog information to clients and not need to change how
> > >>>>>>> loading works.
> > >>>>>>>
> > >>>>>>> Ryan
> > >>>>>>>
> > >>>>>>> On Tue, May 20, 2025 at 9:45 AM Ryan Blue <[email protected]> wrote:
> > >>>>>>>
> > >>>>>>>> Hi everyone,
> > >>>>>>>>
> > >>>>>>>> To avoid passing copies of a file around for comments, I put the
> > >>>>>>>> doc for commit sequence numbers into Google so we can comment on a
> > >>>>>>>> central copy:
> > >>>>>>>> https://docs.google.com/document/d/1jr4Ah8oceOmo6fwxG_0II4vKDUHUKScb/edit?usp=sharing&ouid=100239850723655533404&rtpof=true&sd=true
> > >>>>>>>>
> > >>>>>>>> Ryan
> > >>>>>>>>
> > >>>>>>>> On Fri, May 16, 2025 at 2:51 AM Maninderjit Singh <
> > >>>>>>>> [email protected]> wrote:
> > >>>>>>>>
> > >>>>>>>>> Thanks for the updated proposal, Drew!
> > >>>>>>>>> My preference for using the catalog-authored timestamp is to
> > >>>>>>>>> minimize changes to the REST spec so we can have good backwards
> > >>>>>>>>> compatibility. I have quickly put together a draft proposal on how
> > >>>>>>>>> this should work. Looking forward to feedback and discussion.
> > >>>>>>>>>
> > >>>>>>>>>  Draft Proposal: Catalog‑Authored Timestamps for Apache Iceberg
> > >>>>>>>>> REST Catalog
> > >>>>>>>>> <https://drive.google.com/open?id=1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE>
> > >>>>>>>>>
> > >>>>>>>>> Thanks,
> > >>>>>>>>> Maninder
> > >>>>>>>>>
> > >>>>>>>>> On Wed, May 14, 2025 at 6:12 PM Drew <[email protected]> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Hi everyone,
> > >>>>>>>>>>
> > >>>>>>>>>> Thank you for the feedback on the MTT proposal and during the
> > >>>>>>>>>> community sync. Based on it, Jagdeep and I have iterated on the
> > >>>>>>>>>> document and added a second option to use *Catalog
> > >>>>>>>>>> CommitSequenceNumbers*. We look forward to more feedback on the
> > >>>>>>>>>> proposal, including where to add more details and which approaches
> > >>>>>>>>>> or changes to consider. We appreciate everyone's time on this!
> > >>>>>>>>>>
> > >>>>>>>>>> The option introduces *Catalog CommitSequenceNumbers (CSNs)*,
> > >>>>>>>>>> which allow clients/engines to read a consistent view of multiple
> > >>>>>>>>>> tables without needing to register a transaction context with the
> > >>>>>>>>>> catalog, which in turn removes the need for transaction
> > >>>>>>>>>> bookkeeping on the catalog side. To abort transactions early,
> > >>>>>>>>>> clients can use LoadTable with and without a CSN to figure out
> > >>>>>>>>>> whether there is already a conflicting write on any of the tables
> > >>>>>>>>>> being modified. We also removed the section where transactions
> > >>>>>>>>>> were staging commits on the Catalog, and changed the proposal to
> > >>>>>>>>>> align with Eduard's PR around staging changes locally before
> > >>>>>>>>>> commit (https://github.com/apache/iceberg/pull/6948).
> > >>>>>>>>>>
> > >>>>>>>>>> In a previous email, Jagdeep also gave an example where a workload
> > >>>>>>>>>> may require multi-table snapshot isolation even if the tables are
> > >>>>>>>>>> being updated without the multi-table commit API, though most MTT
> > >>>>>>>>>> transactions will commit using the multi-table commit API.
> > >>>>>>>>>>
> > >>>>>>>>>> Maninder, for the approach of "common notion of time between
> > >>>>>>>>>> clients and catalog": I spent some time thinking about it, but
> > >>>>>>>>>> cannot find a feasible way to do this. Yes, catalogs can use a
> > >>>>>>>>>> high-precision clock, but clients cannot use the Catalog timestamp
> > >>>>>>>>>> from API calls to set their local clock, because of network latency
> > >>>>>>>>>> on the request/response. For example, different requests to the
> > >>>>>>>>>> same Catalog servers can return different timestamps depending on
> > >>>>>>>>>> network latency. Also, what if a client works with more than one
> > >>>>>>>>>> Catalog? If you want to do a rough write-up or share a reference
> > >>>>>>>>>> implementation that uses such an approach, I will be happy to
> > >>>>>>>>>> brainstorm it more. Let us know!
> > >>>>>>>>>>
> > >>>>>>>>>> Here is the link to the updated proposal:
> > >>>>>>>>>> <https://docs.google.com/document/d/1jr4Ah8oceOmo6fwxG_0II4vKDUHUKScb/edit?usp=sharing&ouid=100384647237395649950&rtpof=true&sd=true>
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks again!
> > >>>>>>>>>> - Drew
> > >>>>>>>>>>
> > >>>>>>>>>
>
