Hi all,

I wanted to discuss changing the default position delete file granularity for Spark from partition level to file level for any newly created V2 tables. See this PR [1].

Context on delete file granularity:

- Partition granularity: Writers group deletes for multiple data files in the same partition into a single delete file. This leads to fewer files on disk, but higher read amplification, since a scan reads delete information for data files that are irrelevant to it.
- File granularity: Writers write a new delete file for every changed data file. Reads of delete information are more targeted, but this can lead to more files on disk.

With the recent merge of synchronous position delete maintenance on write in Spark [2], file granularity as a default is more compelling, since reads would be more targeted *and* delete files would be maintained on disk. I also recommend folks go through the deletion vector design doc for more details [3].

Note that for existing tables with high delete-to-data file ratios, Iceberg's rewrite position deletes procedure can compact the table, and every subsequent write would then continuously maintain the position deletes.

Additionally, note that in V3 at most one Puffin position delete file is allowed per data file. What's being discussed here is changing the default granularity for new V2 tables, since it should generally be better after the sync maintenance addition.

What are folks' thoughts on this?

[1] https://github.com/apache/iceberg/pull/11478
[2] https://github.com/apache/iceberg/pull/11273
[3] https://docs.google.com/document/d/18Bqhr-vnzFfQk1S4AgRISkA_5_m5m32Nnc2Cw0zn2XM/edit?tab=t.0#heading=h.193fl7s89tcg

Thanks,
Amogh Jahagirdar
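
P.S. For anyone who wants to see the trade-off concretely, here is a toy model in plain Python. It is only a sketch, not Iceberg code: the sample data, file names, and the helpers `group_deletes` and `rows_read_for` are made up for illustration. It groups the position deletes of a single write under each granularity, then counts how many delete files result and how many delete rows a reader would have to load to apply deletes to one data file.

```python
from collections import defaultdict

# Toy input (hypothetical): each deleted row is (partition, data_file, position).
deletes = [
    ("p=1", "data-a.parquet", 3),
    ("p=1", "data-a.parquet", 7),
    ("p=1", "data-b.parquet", 0),
    ("p=1", "data-c.parquet", 5),
    ("p=2", "data-d.parquet", 1),
]

def group_deletes(deletes, granularity):
    """Group one write's position deletes into delete files, keyed by group."""
    groups = defaultdict(list)
    for partition, data_file, pos in deletes:
        # Partition granularity: one delete file per partition.
        # File granularity: one delete file per changed data file.
        key = partition if granularity == "partition" else (partition, data_file)
        groups[key].append((data_file, pos))
    return dict(groups)

def rows_read_for(data_file, delete_files):
    """Count delete rows loaded to apply deletes to a single data file."""
    return sum(
        len(rows)
        for rows in delete_files.values()
        if any(f == data_file for f, _ in rows)
    )

by_partition = group_deletes(deletes, "partition")
by_file = group_deletes(deletes, "file")

# Number of delete files written, and delete rows read for data-a.parquet.
print(len(by_partition), rows_read_for("data-a.parquet", by_partition))  # 2 4
print(len(by_file), rows_read_for("data-a.parquet", by_file))            # 4 2
```

In this toy case partition granularity writes 2 delete files but a reader of data-a.parquet loads 4 delete rows, 2 of which belong to other data files. File granularity writes 4 delete files and the reader loads only the 2 relevant rows. The model does not capture the sync maintenance from [2], which is what keeps the per-file delete count from growing across writes.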