Background:

1) Position Deletes


Writers determine what rows are deleted and mark them in a 1 for 1
representation. With delete vectors this means every data file has at most
1 delete vector that it is read in conjunction with to excise deleted rows.
Reader overhead is more or less constant and is very predictable.


The main cost of this mode is that deletes must be determined at write time
which is expensive and can be more difficult for conflict resolution

2) Equality Deletes

Writers write out reference to what values are deleted (in a partition or
globally). There can be an unlimited number of equality deletes and they
all must be checked for every data file that is read. The cost of
determining deleted rows is essentially given to the reader.

Conflicts almost never happen since data files are not actually changed and
there is almost no cost to the writer to generate these. Almost all costs
related to equality deletes are passed on to the reader.

Proposal:

Equality deletes are, in my opinion, unsustainable and we should work on
deprecating and removing them from the specification. At this time, I know
of only one engine (Apache Flink) which produces these deletes but almost
all engines have implementations to read them. The cost of implementing
equality deletes on the read path is difficult and unpredictable in terms
of memory usage and compute complexity. We’ve had suggestions of
implementing rocksdb inorder to handle ever growing sets of equality
deletes which in my opinion shows that we are going down the wrong path.

Outside of performance, Equality deletes are also difficult to use in
conjunction with many other features. For example, any features requiring
CDC or Row lineage are basically impossible when equality deletes are in
use. When Equality deletes are present, the state of the table can only be
determined with a full scan making it difficult to update differential
structures. This means materialized views or indexes need to essentially be
fully rebuilt whenever an equality delete is added to the table.

Equality deletes essentially remove complexity from the write side but then
add what I believe is an unacceptable level of complexity to the read side.

Because of this I suggest we deprecate Equality Deletes in V3 and slate
them for full removal from the Iceberg Spec in V4.

I know this is a big change and compatibility breakage so I would like to
introduce this idea to the community and solicit feedback from all
stakeholders. I am very flexible on this issue and would like to hear the
best issues both for and against removal of Equality Deletes.

Thanks everyone for your time,

Russ Spitzer

Reply via email to