Thanks for the proposals! There are things that I think are good about both
of them. I think that the catalog-authored timestamps proposal
misunderstands the purpose of the timestamp field, but does get right that
a monotonically increasing "time" field (really a sequence number) across
tables enables the coordination needed for snapshot-isolated reads. I like
that the sequence number proposal leaves the meaning of the field to the
catalog for coordination, but it still proposes catalog coordination by
loading tables "at" some sequence number. Ideally, we would be able to
(optionally) expose this extra catalog information to clients and not need
to change how loading works.
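
As a rough illustration of the sequence-number idea, here is a minimal sketch. Everything in it is hypothetical: `ToyCatalog`, `catalog_seq`, `snapshot_at`, and `consistent_view` are made-up names, not part of either proposal or the REST spec. It assumes the catalog stamps each commit with a catalog-wide, monotonically increasing number and optionally returns that information alongside a normal load, so the client picks the read point itself instead of loading tables "at" a sequence number.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCatalog:
    """In-memory stand-in for a catalog (hypothetical, not the REST API)."""
    last_seq: int = 0
    # table name -> list of (catalog sequence number, snapshot id), oldest first
    log: dict = field(default_factory=dict)

    def commit(self, table, snapshot_id):
        # Every commit, on any table, gets the next catalog-wide number.
        self.last_seq += 1
        self.log.setdefault(table, []).append((self.last_seq, snapshot_id))
        return self.last_seq

    def load(self, table):
        # Loading is unchanged (latest state); the sequence info is extra.
        return {"history": list(self.log.get(table, [])),
                "catalog_seq": self.last_seq}


def snapshot_at(history, seq):
    """Newest snapshot committed at or before `seq`, or None."""
    visible = [snap for s, snap in history if s <= seq]
    return visible[-1] if visible else None


def consistent_view(loaded):
    # Every loaded history is complete up to the catalog_seq it was loaded
    # at, so the smallest of those is a safe read point for all tables.
    read_seq = min(resp["catalog_seq"] for resp in loaded.values())
    view = {t: snapshot_at(r["history"], read_seq) for t, r in loaded.items()}
    return read_seq, view


catalog = ToyCatalog()
catalog.commit("orders", "o-1")
catalog.commit("customers", "c-1")
orders = catalog.load("orders")        # loaded before the next two commits
catalog.commit("orders", "o-2")
catalog.commit("customers", "c-2")
customers = catalog.load("customers")  # loaded after them

read_seq, view = consistent_view({"orders": orders, "customers": customers})
print(read_seq, view)
```

In this toy run the two loads straddle concurrent commits, and the client still ends up with a matching pair of snapshots by reading both tables as of the smaller load-time sequence number.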

Ryan

On Tue, May 20, 2025 at 9:45 AM Ryan Blue <rdb...@gmail.com> wrote:

> Hi everyone,
>
> To avoid passing copies of a file around for comments, I put the doc for
> commit sequence numbers into Google so we can comment on a central copy:
> https://docs.google.com/document/d/1jr4Ah8oceOmo6fwxG_0II4vKDUHUKScb/edit?usp=sharing&ouid=100239850723655533404&rtpof=true&sd=true
>
> Ryan
>
> On Fri, May 16, 2025 at 2:51 AM Maninderjit Singh <
> parmar.maninder...@gmail.com> wrote:
>
>> Thanks for the updated proposal Drew!
>> I prefer the catalog-authored timestamp because it minimizes changes to
>> the REST spec, so we keep good backwards compatibility. I have quickly
>> put together a draft proposal on how this should work. Looking forward
>> to feedback and discussion.
>>
>>  Draft Proposal: Catalog‑Authored Timestamps for Apache Iceberg REST
>> Catalog
>> <https://drive.google.com/open?id=1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE>
>>
>> Thanks,
>> Maninder
>>
>> On Wed, May 14, 2025 at 6:12 PM Drew <img...@gmail.com> wrote:
>>
>>> Hi everyone,
>>>
>>> Thank you for the feedback on the MTT proposal and during the community
>>> sync. Based on it, Jagdeep and I have iterated on the document and added a
>>> second option that uses *Catalog CommitSequenceNumbers*. We look forward
>>> to more feedback on the proposal, including where to add detail and which
>>> approaches or changes to consider. We appreciate everyone's time on this!
>>>
>>> The option introduces *Catalog CommitSequenceNumbers (CSNs)*, which
>>> allow clients/engines to read a consistent view of multiple tables without
>>> registering a transaction context with the catalog, and so without any
>>> transaction bookkeeping on the catalog side. To abort transactions
>>> early, clients can call LoadTable with and without a CSN to check whether
>>> there is already a conflicting write on any of the tables being modified.
>>> We also removed the section where transactions staged commits on the
>>> catalog, and changed the proposal to align with Eduard's PR around
>>> staging changes locally before commit (
>>> https://github.com/apache/iceberg/pull/6948).
>>>
>>> In a previous email, Jagdeep also gave an example of a workload that may
>>> require multi-table snapshot isolation even when the tables are updated
>>> without the multi-table commit API, though most MTT transactions will
>>> commit using that API.
>>>
>>> Maninder, for the approach of "common notion of time between clients and
>>> catalog" - I spent some time thinking about it, but cannot find a feasible
>>> way to do this. Yes, catalogs can use a high-precision clock, but
>>> clients cannot use the catalog timestamp from API calls to set their local
>>> clock because of network latency on the request/response. For example,
>>> different requests to the same catalog server can return different
>>> timestamps depending on network latency. Also, what if a client works with
>>> more than one catalog? If you want to do a rough write-up or share a
>>> reference implementation that uses such an approach, I will be happy to
>>> brainstorm it more. Let us know!
>>>
>>> Here is the link to the updated proposal:
>>> <https://docs.google.com/document/d/1jr4Ah8oceOmo6fwxG_0II4vKDUHUKScb/edit?usp=sharing&ouid=100384647237395649950&rtpof=true&sd=true>
>>>
>>> Thanks again!
>>> - Drew
>>>
>>
