Hey Ashvin,

Thanks for taking the time to write up the proposal.

I have one big question that we need to clarify first. Many
implementations out there today expect that the table location
<https://iceberg.apache.org/spec/#table-metadata> is unique to the
table, but this isn't called out explicitly in the spec. For context:
maintenance operations run to keep the location free of dangling
files. When a process fails while writing files out, it leaves files
in storage that are not referenced in the metadata tree. It is best
practice to run remove-orphan-files
<https://iceberg.apache.org/docs/nightly/spark-procedures/#remove_orphan_files>
periodically to clean up these files and avoid unnecessary storage
costs. The procedure lists the whole metadata tree and compares it
with the contents under the location property of the metadata. When
the location is shared, it could delete files that belong to another
table. Operations like these become increasingly important as more
metadata is written, as the proposal suggests.
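
For illustration, a typical invocation looks roughly like this,
assuming a SparkSession with the Iceberg runtime and a catalog named
my_catalog configured (all names below are placeholders):

    from pyspark.sql import SparkSession

    # Placeholder names throughout; dry_run only lists the files
    # that would be deleted instead of removing them.
    spark = SparkSession.builder.getOrCreate()
    spark.sql("""
        CALL my_catalog.system.remove_orphan_files(
            table => 'db.my_table',
            dry_run => true
        )
    """).show()

With a shared location, even a dry run like this would already list
files that belong to the other table.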

Kind regards,
Fokko


On Thu, Nov 14, 2024 at 10:49 lisoda <lis...@yeah.net> wrote:

> Hello Team.
>
> I am delighted that the Iceberg community has brought up this matter. We
> have always believed that making Iceberg tables operable purely through
> the file system is a very valuable feature. At the very least, we should
> allow users to read Iceberg tables solely through the file system. As a
> competitor to Iceberg, Apache Paimon has adopted a similar approach: it
> uses a set of file-system-based operations to manage the catalog and
> thereby achieves interoperability across multiple engines and boundaries.
> Furthermore, in practice, we have also explored solutions for catalog
> management based on S3/DFS/local file systems. We use only a limited set
> of list and append (create-new-file) operations within the file system to
> implement catalog management, eliminating all dependencies on operations
> such as rename that are not consistent across file systems, as sketched
> below. Through this design, we have achieved reliable catalog management
> on object storage such as S3. If possible, after refining our prototype,
> we would like to contribute it to Iceberg.
>
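> To make this concrete, here is a rough, purely illustrative sketch of
> the read path under such a design (all names are hypothetical, not our
> actual code): commits are immutable, zero-padded, sequence-numbered
> files, so writers only create new files and readers only list and
> read; no rename is ever needed.
>
>     import os
>
>     # Hypothetical commit-log directory: each commit is one immutable
>     # file whose zero-padded name is its sequence number.
>     def latest_metadata(log_dir: str) -> str:
>         entries = sorted(os.listdir(log_dir))   # list, lexicographic
>         with open(os.path.join(log_dir, entries[-1])) as f:
>             return f.read().strip()             # current metadata path
>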
> Tks.
> Lisoda.
>
>
>
>
>
>
> On 2024-11-14 at 02:24:50, "Marc Cenac" <marc.ce...@datadoghq.com.INVALID> wrote:
>
> Thanks for the proposal Ashvin!
>
> I see value in adding this to support the use case of allowing read-only
> access from Snowflake. Currently we push updates with an ALTER TABLE
> command
> <https://docs.snowflake.com/en/sql-reference/sql/alter-iceberg-table-refresh>
> to synchronize our internally-hosted catalog with Snowflake, so a
> version-hint file would potentially eliminate this need.
>
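> For reference, that refresh amounts to something like the following
> (illustrative only; connection details, table name, and metadata path
> are placeholders):
>
>     import snowflake.connector
>
>     # Placeholder credentials and names; the REFRESH points Snowflake
>     # at the latest metadata file produced by our internal catalog.
>     conn = snowflake.connector.connect(
>         account="...", user="...", password="..."
>     )
>     conn.cursor().execute(
>         "ALTER ICEBERG TABLE my_db.my_schema.my_table "
>         "REFRESH 'metadata/v123.metadata.json'"
>     )
>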
> One question I have is "how could we prevent the version-hint file from
> being removed during the delete orphan files procedure?"  If version-hint
> is an optional file that is not tracked in the table's metadata, it seems
> this file could be removed during table maintenance.
>
> On Mon, Nov 11, 2024 at 2:03 PM Ashvin A <ash...@apache.org> wrote:
>
>> Hello Community,
>>
>> We would like to share a proposal to standardize a file-system-based
>> method to identify Iceberg tables’ current snapshot.
>>
>> Proposal doc: Adding a File System based Consistent Method to Identify
>> Iceberg Tables’ Current Snapshot
>> <https://docs.google.com/document/d/1yzLXSOtzBXyaWHfeVsWsMu4xmOH8rV6QyM5ZAnJZjMQ/edit?pli=1&tab=t.0#heading=h.yhvnt89pggpj>
>>
>> The proposal aims to enhance the interoperability and self-sufficiency
>> of Iceberg tables by replicating the snapshot's metadata file name
>> (the version-hint) from the catalog to the file system. This makes the
>> table representation on the file system complete and removes the
>> dependency on a catalog in certain read-only scenarios.
>>
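>> For illustration, a read that needs no catalog could look roughly like
>> this, if the proposal borrows the version-hint convention used by
>> today's HadoopCatalog (the exact file name and format are up to the
>> proposal):
>>
>>     # Hypothetical sketch for a local path; object stores would use
>>     # their own SDK for the same two small reads.
>>     def current_metadata_path(table_location: str) -> str:
>>         with open(f"{table_location}/metadata/version-hint.text") as f:
>>             version = f.read().strip()
>>         return f"{table_location}/metadata/v{version}.metadata.json"
>>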
>> Use Case: Microsoft Fabric now supports Iceberg tables in OneLake,
>> allowing users to leverage Iceberg tables in addition to Delta Lake tables
>> with Microsoft Fabric’s compute engines. Having a file-system-based
>> integration reduces the number of components required in the read query
>> execution path, especially when the catalog is inaccessible or during
>> pre-production scenarios.
>>
>> Please review the proposal document and share your suggestions in the
>> comments. We look forward to discussing this further.
>>
>> Best,
>> Ashvin
>>
>
