Hi,

I understand that files in Iceberg tables are immutable. However, one of
our use-cases requires modifying a Parquet file belonging to an Iceberg
table, and I am trying to figure out how to support this.

Would a single Iceberg transaction that first deletes the file and then adds
it back work?

The spec contains the following:

   1. Technically, data files can be deleted when the last snapshot that
   contains the file as “live” data is garbage collected. But this is harder
   to detect and requires finding the diff of multiple snapshots. It is easier
   to track what files are deleted in a snapshot and delete them when that
   snapshot expires.

From the above, it looks like deleting a file and adding it back as 2
separate transactions will not work. The file can be garbage collected when
the snapshot created by the delete transaction expires.
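
To make the concern concrete, here is a toy Python model of the sequence I am
worried about. It is not Iceberg code: ToyTable, its methods, and the file
path are made up for illustration. It assumes that expiring a snapshot
physically removes every path that snapshot recorded as deleted, which is my
reading of the excerpt above and may not match what the implementation does.

```python
# Toy model only, not Iceberg code. Assumption: expiring a snapshot removes
# from storage every path that the snapshot recorded as deleted.

class ToyTable:
    def __init__(self):
        self.storage = set()    # paths that physically exist
        self.snapshots = []     # oldest first: {"live": set, "deleted": set}

    def commit(self, add=(), delete=()):
        # Each commit produces a new snapshot derived from the previous one.
        live = set(self.snapshots[-1]["live"]) if self.snapshots else set()
        live -= set(delete)
        live |= set(add)
        self.storage |= set(add)
        self.snapshots.append({"live": live, "deleted": set(delete)})

    def expire_oldest(self):
        # Drop the oldest snapshot and physically remove the files it deleted.
        snap = self.snapshots.pop(0)
        self.storage -= snap["deleted"]

    def missing_files(self):
        # Files the current snapshot considers live but that no longer exist.
        return self.snapshots[-1]["live"] - self.storage


def two_transaction_demo():
    path = "data/f.parquet"          # hypothetical file path
    table = ToyTable()
    table.commit(add=[path])         # snapshot 1: original file
    table.commit(delete=[path])      # snapshot 2: delete in its own transaction
    table.commit(add=[path])         # snapshot 3: modified file, same path
    table.expire_oldest()            # expire snapshot 1
    table.expire_oldest()            # expire snapshot 2, which recorded the delete
    return table.missing_files()
```

Under that assumption, two_transaction_demo() reports the re-added file as
missing, because expiring the snapshot that recorded the delete removes the
path that the latest snapshot still references. The model says nothing about
what happens when the delete and the add are in the same transaction.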

Is there a way to delete a file and add it back in the same transaction?

Thanks
Vivek
