Re: [Discuss] Global Snapshot Consistency for Iceberg Tables

Xiening Dai Fri, 05 Jun 2026 16:16:15 -0700

I have also replied to your comments in the doc. Thank you.


On 2026/06/04 23:35:04 Maninder Parmar wrote:
> Hi Xiening,
> The LoadTables proposal above seems to address the problem of atomically
> reading the metadata.json across multiple tables "as of" a consistent time.
> The CSN proposal provides a detailed explanation
> <https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?tab=t.0#bookmark=id.ue33k3ujfi7s>
> of how to achieve it.
> It does not require reading metadata.json N times for a single table or
> pinning the catalog state (I have added comments and provided links to the
> relevant sections). Also, there is no need to rewrite the artifacts
> (manifests/manifest lists) stored in cloud storage, as the CSN lives only in
> the TableMetadata, which for REST catalogs is written only by the catalog.
> 
> The rest of the proposal aligns closely with the CSN proposal described here
> <https://docs.google.com/document/d/1KVgUJc1WgftHfLz118vMbEE7HV8_pUDk4s-GJFDyAOE/edit?tab=t.0#heading=h.nwyigim62nez>
> .
> 
> Thanks,
> Maninder
> 
> 
> 
> 
> 
> On Wed, Jun 3, 2026 at 8:59 AM Xiening Dai <[email protected]> wrote:
> 
> > Hi all,
> >
> > Today, the Iceberg spec has table properties defining the transaction
> > isolation levels: write.delete/update/merge.isolation-level. These
> > properties can be set to either `snapshot` or `serializable`. With a
> > properly designed writer and Iceberg's multi-version snapshots, we can
> > achieve single-table snapshot isolation or even serializable isolation.
> >
> > But for queries involving multiple tables, the spec does not provide a
> > mechanism to achieve global snapshot consistency. The Iceberg REST
> > Catalog (IRC) API provides only a single-table load operation, LoadTable, so
> > clients need to call this API multiple times to resolve table
> > metadata for a single query statement, and each call could observe a
> > different snapshot view of the catalog.
> >
> > This creates a problem, especially for engines that already support global
> > SI. For example, the transaction semantics of AWS Redshift when querying its
> > native tables differ from those when querying Iceberg tables, which
> > surprises customers at times.
> >
> > There were proposals in the past in the context of the multi-statement
> > transaction discussion (
> > https://docs.google.com/document/d/1jr4Ah8oceOmo6fwxG_0II4vKDUHUKScb/edit#heading=h.qb9z621zr507).
> > But I feel these proposals are too complicated and require significant
> > changes to the catalog/IRC protocol.
> >
> > Here I propose a simpler approach: add a batch LoadTables API, and rely on
> > the catalog's underlying system-of-record to provide snapshot isolation for
> > that batch read.
> >
> > When a client calls LoadTables({table_a, table_b, table_c}), the catalog
> > reads the current metadata for all requested tables in a single consistent
> > operation (e.g., a TransactGetItems in DynamoDB, or a single SI read in a
> > relational DB). The client receives a consistent cross-table snapshot — the
> > latest committed state of all requested tables as of a single point in time.
> >
> > This would give us statement-level global snapshot consistency. It
> > doesn’t provide full transaction-level SI consistency for multi-statement
> > transactions, but I believe it’s a reasonable trade-off.
> >
> > I have captured the details of this proposal in this doc -
> > https://docs.google.com/document/d/1u11b4pzeFUKD0XX--nHPj-DoYcNeCgOe94WKCaX2XMI/edit?usp=sharing
> >
> > I also created a prototype that implements the LoadTables API for Apache
> > Polaris, leveraging the underlying Postgres for snapshot isolation -
> > https://github.com/xndai/polaris/commit/f4eb514a2920effe67ecfb8c64e2e3fa418baf11
> >
> > Feedback and comments are welcome!
> >
>
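
A minimal sketch of the batch-read idea quoted above, assuming the catalog keeps one row per table in a relational system-of-record. This is not the Polaris prototype and not the proposed IRC API: SQLite stands in for the backing store, and the schema and the helper names (make_catalog, load_table, load_tables, commit_both) are hypothetical, chosen only for illustration. It contrasts per-table loads, which can straddle a concurrent commit, with a batch load resolved by a single read.

```python
import sqlite3


def make_catalog():
    # Hypothetical system-of-record: one row per table, pointing at the
    # table's current metadata file.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE iceberg_tables ("
        "name TEXT PRIMARY KEY, metadata_location TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO iceberg_tables VALUES (?, ?)",
        [
            ("table_a", "table_a/v1.metadata.json"),
            ("table_b", "table_b/v1.metadata.json"),
        ],
    )
    conn.commit()
    return conn


def load_table(conn, name):
    # Single-table load: each call is its own read of the catalog.
    row = conn.execute(
        "SELECT metadata_location FROM iceberg_tables WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    return row[0]


def load_tables(conn, names):
    # Batch load: one SELECT resolves every requested table, so all returned
    # pointers reflect the same committed state of the catalog.
    names = list(names)
    if not names:
        return {}
    placeholders = ",".join("?" for _ in names)
    rows = conn.execute(
        "SELECT name, metadata_location FROM iceberg_tables "
        f"WHERE name IN ({placeholders})",
        names,
    ).fetchall()
    found = dict(rows)
    missing = [n for n in names if n not in found]
    if missing:
        raise KeyError(missing)
    return {n: found[n] for n in names}


def commit_both(conn, version):
    # Hypothetical writer that advances both tables in one transaction.
    with conn:
        for name in ("table_a", "table_b"):
            conn.execute(
                "UPDATE iceberg_tables SET metadata_location = ? WHERE name = ?",
                (f"{name}/v{version}.metadata.json", name),
            )


def demo():
    conn = make_catalog()
    # Per-table loads with a commit landing in between: a mixed view.
    a = load_table(conn, "table_a")
    commit_both(conn, 2)
    b = load_table(conn, "table_b")
    print("per-table loads:", a, b)
    # Batch load: both pointers come from the same catalog state.
    print("batch load:", load_tables(conn, ["table_a", "table_b"]))


if __name__ == "__main__":
    demo()
```

In the demo, the two per-table loads return table_a at v1 and table_b at v2, a combination that no single committed catalog state ever contained, while the batch load returns both tables at the same version. A real catalog would get the same guarantee from whatever consistent multi-item read its backing store offers.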
