I'm working on incremental ingest of Iceberg tables into SingleStore. I know this is an active area of work in the Iceberg community, since it's very similar to maintaining materialized views, except that when ingesting into another system the "MV" is a trivial one. For Iceberg v2, though, I have some questions about when delete files can go away. In particular, I'm thinking about overlapping equality deletes.
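To make the overlap concrete, here is a minimal sketch in plain Python. It is not the Iceberg API: the `live_rows` helper, the `category` column, and the sample rows are all made up for illustration, and it ignores sequence numbers and partition scoping. It models two equality deletes on different columns and shows that dropping one of them restores only the rows that no other delete still covers.

```python
from typing import Dict, FrozenSet, List, Tuple

Row = Dict[str, object]
# Hypothetical model of an equality delete file: the columns it matches on,
# plus the set of value tuples it deletes. Not an Iceberg data structure.
EqDelete = Tuple[Tuple[str, ...], FrozenSet[Tuple[object, ...]]]


def matches(row: Row, delete: EqDelete) -> bool:
    """True if the row's values on the delete's columns are in the delete set."""
    cols, values = delete
    return tuple(row[c] for c in cols) in values


def live_rows(data: List[Row], deletes: List[EqDelete]) -> List[Row]:
    """Rows that survive after applying every equality delete."""
    return [r for r in data if not any(matches(r, d) for d in deletes)]


data = [
    {"id": 1, "category": "a"},
    {"id": 2, "category": "b"},
    {"id": 3, "category": "b"},
    {"id": 4, "category": "c"},
]
delete_by_id = (("id",), frozenset({(1,), (2,)}))
delete_by_category = (("category",), frozenset({("a",)}))

# Snapshot N carries both delete files; snapshot N+1 drops the id delete.
snapshot_n = live_rows(data, [delete_by_id, delete_by_category])
snapshot_n_plus_1 = live_rows(data, [delete_by_category])

# What a reader would restore if it looked only at the dropped delete file.
naive_restored = [r for r in data if matches(r, delete_by_id)]
# What is actually restored once the remaining delete is taken into account.
restored = [r for r in snapshot_n_plus_1 if r not in snapshot_n]

print([r["id"] for r in naive_restored])
print([r["id"] for r in restored])
```

In this toy model the id delete covers ids 1 and 2, but id 1 is also covered by the category delete, so only id 2 comes back. An incremental reader therefore cannot treat a dropped equality delete as "restore every row it matched" without checking the deletes that remain.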
The case I'm worried about is that an equality delete file is removed, representing some rows being restored, but some other equality delete overlaps with the dropped one. This would happen only for equality deletes on non-unique columns, and only if you have equality deletes which are applied to different columns. Both seem like a bad idea, but both are allowed by the spec.

As I understand it, a delete operation can *only* remove data files or add delete files; it cannot remove delete files. And of course an append can only add data files. Given that, it seems to me that there are two ways that delete files could be in one snapshot and not appear in a subsequent one:

1. During a replace operation, a delete file may be partly or fully applied, and new data and delete files generated. For instance, if snapshot N has a data file with records with ids = [1, 2, 3, 4] and an equality delete file on the id column with values [1, 2], a new snapshot N+1 might have a data file with ids [3, 4] and no delete files, or it might have a data file with ids [2, 3, 4] and a delete file for ids [2]. The second case is a partial compaction and seems legal, if a little unusual on its face.

2. During an overwrite operation, anything could happen, including the revert of a delete (dropping of a delete file from the table). This could occur due to reverting to an earlier snapshot in a system that writes to the table. This could also be done by writing an older value into current-snapshot-id in the table metadata, but that isn't the only way to represent a revert.

I think that those are the only two ways that an equality delete file can go away. Am I missing one?

And what are other implementers doing here? Are there implementations out there which will remove equality deletes to represent a row being restored?

Michael Leuchtenburg
Staff Software Engineer