[Proposal] Replicating version-hint onto the file system

Ashvin A Mon, 11 Nov 2024 12:03:47 -0800

Hello Community,

We would like to share a proposal to standardize a file-system-based method
for identifying the current snapshot of an Iceberg table.


Proposal doc: Adding a File System based Consistent Method to Identify
Iceberg Tables’ Current Snapshot
<https://docs.google.com/document/d/1yzLXSOtzBXyaWHfeVsWsMu4xmOH8rV6QyM5ZAnJZjMQ/edit?pli=1&tab=t.0#heading=h.yhvnt89pggpj>

The proposal aims to improve the interoperability and self-sufficiency of
Iceberg tables by replicating the name of the current snapshot's metadata
file (the version-hint) from the catalog to the file system. This makes the
table's representation on the file system complete and removes the
dependency on the catalog in certain read-only scenarios.
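
To illustrate the read-only path this enables, here is a minimal sketch of
how a reader might resolve the current metadata file from the file system
alone. It is not the design in the proposal doc: the function name
resolve_current_metadata is hypothetical, and the layout (a
metadata/version-hint.text file holding either a bare version number, as
Iceberg's Hadoop tables use, or a metadata file name) is an assumption made
for illustration.

```python
from pathlib import Path

# Assumption: the hint lives at <table_root>/metadata/version-hint.text,
# the file name used by Iceberg's Hadoop tables.
HINT_FILE = "version-hint.text"


def resolve_current_metadata(table_root: str) -> Path:
    """Hypothetical helper: find the current metadata file without a catalog."""
    metadata_dir = Path(table_root) / "metadata"
    hint = (metadata_dir / HINT_FILE).read_text(encoding="utf-8").strip()
    if not hint:
        raise ValueError("version hint is empty")

    # Assumption: a numeric hint N maps to v<N>.metadata.json (Hadoop-table
    # style); anything else is treated as the metadata file name itself.
    name = f"v{hint}.metadata.json" if hint.isdigit() else hint

    candidate = metadata_dir / name
    if not candidate.is_file():
        raise FileNotFoundError(f"metadata file not found: {candidate}")
    return candidate
```

A reader would call resolve_current_metadata("/path/to/table") and then
parse the returned JSON file to find the current snapshot. The sketch does
not address how the hint is kept consistent with the catalog on commit,
which is the part the proposal is meant to standardize.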

Use Case: Microsoft Fabric now supports Iceberg tables in OneLake, allowing
users to work with Iceberg tables in addition to Delta Lake tables using
Microsoft Fabric’s compute engines. A file-system-based integration reduces
the number of components required in the read query execution path, which
is especially useful when the catalog is inaccessible or in pre-production
scenarios.

Please review the proposal document and share your suggestions in the
comments. We look forward to discussing this further.

Best,
Ashvin
