Hi,

Thank you Russell for bringing up this topic, and for the nice write-up.

From the perspective of engines like Trino, equality deletes bring little value and add a lot of complications, so +1 from me on this.
I understand they exist for a reason though. Maybe it was just a lazy choice that we should revisit, or maybe it's something fundamental. Before deprecating, let's make sure we will really be able to remove them, e.g. that the Flink use case can be addressed without them.

Best
Piotr

On Thu, 31 Oct 2024 at 04:10, Rodrigo Meneses <rmene...@gmail.com> wrote:

> I have a very basic question. Is there already an alternative to equality
> deletes for Flink? If we deprecate the feature, is it because there’s
> already an alternative, correct?
>
> On Wed, Oct 30, 2024 at 7:50 PM Manu Zhang <owenzhang1...@gmail.com>
> wrote:
>
>> I think Apache Paimon could point us in the direction of supporting
>> streaming upsert use cases. We are already working on some of the building
>> blocks like deletion vectors and Flink compaction.
>>
>> +1 to the proposal since users are not recommended to use equality
>> deletes for streaming upserts anyway.
>>
>> [1]
>> https://medium.com/@ipolyzos_/apache-paimon-introducing-deletion-vectors-584666ee90de
>>
>> On Thu, Oct 31, 2024 at 10:16 AM Ajantha Bhat <ajanthab...@gmail.com>
>> wrote:
>>
>>> Equality deletes aren't only written from Flink; Iceberg Kafka Connect
>>> (Tabular’s version) also writes equality deletes for upserts.
>>>
>>>> Writers write out reference to what values are deleted (in a partition
>>>> or globally). There can be an unlimited number of equality deletes and
>>>> they all must be checked for every data file that is read. The cost of
>>>> determining deleted rows is essentially given to the reader
>>>
>>> Should we focus on optimizing these by compacting them into a single
>>> file to reduce read overhead?
>>> What are the plans for supporting streaming writes in Iceberg if we move
>>> away from equality deletes? Can we achieve real-time writing with position
>>> deletes instead, or would this impact write performance?
>>>
>>> - Ajantha
>>>
>>> On Thu, Oct 31, 2024 at 7:18 AM Gang Wu <ust...@gmail.com> wrote:
>>>
>>>> Thanks Russell for bringing this up!
>>>>
>>>> +1 on deprecating equality deletes.
>>>>
>>>> IMHO, this is something that should reside only in the ingestion engine.
>>>>
>>>> Best,
>>>> Gang
>>>>
>>>> On Thu, Oct 31, 2024 at 5:07 AM Russell Spitzer <
>>>> russell.spit...@gmail.com> wrote:
>>>>
>>>>> Background:
>>>>>
>>>>> 1) Position Deletes
>>>>>
>>>>> Writers determine what rows are deleted and mark them in a 1 for 1
>>>>> representation. With delete vectors this means every data file has at
>>>>> most 1 delete vector that it is read in conjunction with to excise
>>>>> deleted rows. Reader overhead is more or less constant and is very
>>>>> predictable.
>>>>>
>>>>> The main cost of this mode is that deletes must be determined at write
>>>>> time which is expensive and can be more difficult for conflict resolution
>>>>>
>>>>> 2) Equality Deletes
>>>>>
>>>>> Writers write out reference to what values are deleted (in a partition
>>>>> or globally). There can be an unlimited number of equality deletes and
>>>>> they all must be checked for every data file that is read. The cost of
>>>>> determining deleted rows is essentially given to the reader.
>>>>>
>>>>> Conflicts almost never happen since data files are not actually
>>>>> changed and there is almost no cost to the writer to generate these.
>>>>> Almost all costs related to equality deletes are passed on to the reader.
>>>>>
>>>>> Proposal:
>>>>>
>>>>> Equality deletes are, in my opinion, unsustainable and we should work
>>>>> on deprecating and removing them from the specification. At this time, I
>>>>> know of only one engine (Apache Flink) which produces these deletes but
>>>>> almost all engines have implementations to read them. The cost of
>>>>> implementing equality deletes on the read path is difficult and
>>>>> unpredictable in terms of memory usage and compute complexity.
>>>>> We’ve had suggestions of implementing RocksDB in order to handle ever
>>>>> growing sets of equality deletes which in my opinion shows that we are
>>>>> going down the wrong path.
>>>>>
>>>>> Outside of performance, Equality deletes are also difficult to use in
>>>>> conjunction with many other features. For example, any features requiring
>>>>> CDC or Row lineage are basically impossible when equality deletes are in
>>>>> use. When Equality deletes are present, the state of the table can only
>>>>> be determined with a full scan making it difficult to update differential
>>>>> structures. This means materialized views or indexes need to essentially
>>>>> be fully rebuilt whenever an equality delete is added to the table.
>>>>>
>>>>> Equality deletes essentially remove complexity from the write side but
>>>>> then add what I believe is an unacceptable level of complexity to the
>>>>> read side.
>>>>>
>>>>> Because of this I suggest we deprecate Equality Deletes in V3 and
>>>>> slate them for full removal from the Iceberg Spec in V4.
>>>>>
>>>>> I know this is a big change and compatibility breakage so I would like
>>>>> to introduce this idea to the community and solicit feedback from all
>>>>> stakeholders. I am very flexible on this issue and would like to hear the
>>>>> best issues both for and against removal of Equality Deletes.
>>>>>
>>>>> Thanks everyone for your time,
>>>>>
>>>>> Russ Spitzer
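
For readers new to the thread, the read-side difference between the two delete modes described in Russell's background section can be sketched roughly as follows. This is a minimal, hypothetical Python model, not Iceberg's actual file formats or APIs: `DataFile`, `read_with_position_deletes`, and `read_with_equality_deletes` are illustrative names, and the sketch ignores sequence numbers and partition scoping.

```python
from dataclasses import dataclass

# Hypothetical, simplified model; not Iceberg's real file formats or APIs.


@dataclass
class DataFile:
    path: str
    rows: list  # one dict per row


def read_with_position_deletes(data_file, deleted_positions):
    """Position deletes: the writer already resolved which row ordinals in
    this file are deleted, so the reader only filters by position."""
    return [row for pos, row in enumerate(data_file.rows)
            if pos not in deleted_positions]


def read_with_equality_deletes(data_file, equality_deletes):
    """Equality deletes: the writer only recorded the deleted values, so the
    reader must test every row against every applicable delete."""
    def is_deleted(row):
        return any(
            all(row.get(col) == val for col, val in delete.items())
            for delete in equality_deletes
        )
    return [row for row in data_file.rows if not is_deleted(row)]


data = DataFile("data-0001.parquet", [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 3, "name": "c"},
])

# Writer-side work: the position of id=2 was looked up at write time.
print(read_with_position_deletes(data, {1}))

# Reader-side work: the list of delete predicates can keep growing, and
# each one is checked against this file even if it matches nothing.
print(read_with_equality_deletes(data, [{"id": 2}, {"id": 7}]))
```

In this toy model both calls return the rows with `id` 1 and 3. The difference is where the work happens: the position-delete reader does one membership check per row, while the equality-delete reader's cost grows with the number of outstanding deletes.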