Hi all,

I wanted to discuss changing the default position delete file granularity
for Spark from partition to file level for any newly created V2 tables. See
this PR [1].

Context on delete file granularity:

   - Partition granularity: Writers group position deletes for multiple
   data files from the same partition into a single delete file. This leads
   to fewer files on disk, but higher read amplification, since a scan also
   reads delete information for data files that are irrelevant to it.
   - File granularity: Writers write a separate delete file for every
   changed data file. Reads are more targeted since only the relevant
   delete information is read, but this can lead to more files on disk.

With the recent merge of synchronous position delete maintenance on write
in Spark [2], file granularity as a default is more compelling, since reads
would be more targeted *and* delete files would be maintained on disk as
part of the write. I also recommend folks go through the deletion vector
design doc for more details [3].
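
For anyone who wants to opt a table into this behavior today, the
granularity can already be set per table via the write.delete.granularity
property. A minimal sketch from spark-shell (where the spark session is
predefined), assuming an Iceberg catalog named my_catalog and a placeholder
table db.events:

  // Opt an existing table into file-level position deletes. Catalog and
  // table names here are placeholders for illustration.
  spark.sql(
    "ALTER TABLE my_catalog.db.events " +
      "SET TBLPROPERTIES ('write.delete.granularity' = 'file')")

The same property can also be set at CREATE TABLE time through
TBLPROPERTIES.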

Note that for existing tables with high delete-to-data file ratios,
Iceberg's rewrite position deletes procedure can compact the existing
position delete files, and every subsequent write would then continuously
maintain them. Additionally, note that in V3 at most one Puffin position
delete file is allowed per data file; what's being discussed here is
changing the default granularity for new V2 tables, since it should
generally be better after the sync maintenance addition.
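
For completeness, a sketch of kicking off that compaction from spark-shell
(same placeholder catalog and table names as above):

  // Compact existing position delete files; subsequent writes then keep
  // the position deletes maintained going forward.
  spark.sql(
    "CALL my_catalog.system.rewrite_position_delete_files(" +
      "table => 'db.events')")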

What are folks' thoughts on this?

[1] https://github.com/apache/iceberg/pull/11478
[2] https://github.com/apache/iceberg/pull/11273
[3]
https://docs.google.com/document/d/18Bqhr-vnzFfQk1S4AgRISkA_5_m5m32Nnc2Cw0zn2XM/edit?tab=t.0#heading=h.193fl7s89tcg

Thanks,

Amogh Jahagirdar
