Hi,

I have some questions on the current Iceberg spec regarding equality deletes:
https://iceberg.apache.org/spec/#equality-delete-files

The spec says that for "a table with the following data:

 1: id | 2: category | 3: name
-------|-------------|---------
 1     | marsupial   | Koala
 2     | toy         | Teddy
 3     | NULL        | Grizzly
 4     | NULL        | Polar

The delete id = 3 could be written as either of the following equality delete files:

equality_ids=[1]

 1: id
-------
 3

equality_ids=[1]

 1: id | 2: category | 3: name
-------|-------------|---------
 3     | NULL        | Grizzly
"

1. Are the options either (a) write only the column(s) listed in equality_ids or (b) write all the columns, i.e., nothing in between?

2. If we write all the columns, are only the columns listed in equality_ids considered? What happens if a column not listed in equality_ids does not match? E.g.,

equality_ids=[1]

 1: id | 2: category | 3: name
-------|-------------|---------
 3     | NULL        | Polar

Is that (a) invalid, or does that (b) still result in deleting id = 3, or (c) result in deleting no rows?

The spec says "Each row of the delete file produces one equality predicate that matches any row where the delete columns are equal.
Multiple columns can be thought of as an AND of equality predicates." That could be interpreted to mean (c).

Thanks,
Wing Yew
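
P.S. To make readings (b) and (c) from question 2 concrete, here is a minimal Python sketch. It is not Iceberg code; the names DATA, matches, and apply_delete are made up for illustration, rows are plain dicts keyed by field id, and treating NULL as equal to NULL is an assumption of the sketch. It does not say which reading the spec intends, it only shows what each one would do to the example table.

```python
# Hypothetical model of the example table: each row is {field_id: value}.
DATA = [
    {1: 1, 2: "marsupial", 3: "Koala"},
    {1: 2, 2: "toy", 3: "Teddy"},
    {1: 3, 2: None, 3: "Grizzly"},
    {1: 4, 2: None, 3: "Polar"},
]


def matches(delete_row, data_row, columns):
    """AND of per-column equality predicates over `columns`.

    Assumption of this sketch: None == None models NULL matching NULL.
    """
    return all(delete_row[c] == data_row[c] for c in columns)


def apply_delete(data, delete_row, columns):
    """Return the rows that survive the delete under the given column set."""
    return [row for row in data if not matches(delete_row, row, columns)]


equality_ids = [1]
# The delete row from question 2: id matches row 3, but name does not.
delete_row = {1: 3, 2: None, 3: "Polar"}

# Reading (b): only the columns listed in equality_ids are compared.
remaining_b = apply_delete(DATA, delete_row, equality_ids)

# Reading (c): every column written in the delete file is compared.
remaining_c = apply_delete(DATA, delete_row, list(delete_row))

print("reading (b) keeps ids:", [row[1] for row in remaining_b])
print("reading (c) keeps ids:", [row[1] for row in remaining_c])
```

Under reading (b) the row with id = 3 is removed; under reading (c) nothing is removed, because no data row equals the delete row on all three columns.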