Hi,

Thank you, Russell, for bringing up this topic and for the nice write-up.
From the perspective of engines like Trino, equality deletes bring little
value and add a lot of complications, so +1 from me on this.

I understand they exist for a reason though. Maybe it was just a lazy
choice that we should just revisit, or maybe it's something fundamental.
Before deprecating, let's make sure we will really be able to remove them,
e.g. that the Flink use case can be addressed without them.

Best
Piotr


On Thu, 31 Oct 2024 at 04:10, Rodrigo Meneses <rmene...@gmail.com> wrote:

> I have a very basic question. Is there already an alternative to equality
> deletes for Flink? If we deprecate the feature, is that because there’s
> already an alternative?
>
> On Wed, Oct 30, 2024 at 7:50 PM Manu Zhang <owenzhang1...@gmail.com>
> wrote:
>
>> I think Apache Paimon could point us in the direction of supporting
>> streaming upserts use cases. We are already working on some of the building
>> blocks like deletion vectors and Flink compaction.
>>
>> +1 to the proposal since users are not recommended to use equality
>> deletes for streaming upserts anyway.
>>
>> [1]
>> https://medium.com/@ipolyzos_/apache-paimon-introducing-deletion-vectors-584666ee90de
>>
>>
>> On Thu, Oct 31, 2024 at 10:16 AM Ajantha Bhat <ajanthab...@gmail.com>
>> wrote:
>>
>>> Equality deletes aren't only written from Flink; Iceberg Kafka Connect
>>> (Tabular’s version) also writes equality deletes for upserts.
>>>
>>>> Writers write out reference to what values are deleted (in a partition
>>>> or globally). There can be an unlimited number of equality deletes and they
>>>> all must be checked for every data file that is read. The cost of
>>>> determining deleted rows is essentially given to the reader
>>>
>>>
>>> Should we focus on optimizing these by compacting them into a single
>>> file to reduce read overhead?
>>> What are the plans for supporting streaming writes in Iceberg if we move
>>> away from equality deletes? Can we achieve real-time writing with position
>>> deletes instead, or would this impact write performance?
>>>
>>> - Ajantha
>>>
>>> On Thu, Oct 31, 2024 at 7:18 AM Gang Wu <ust...@gmail.com> wrote:
>>>
>>>> Thanks Russell for bringing this up!
>>>>
>>>> +1 on deprecating equality deletes.
>>>>
>>>> IMHO, this is something that should reside only in the ingestion engine.
>>>>
>>>> Best,
>>>> Gang
>>>>
>>>> On Thu, Oct 31, 2024 at 5:07 AM Russell Spitzer <
>>>> russell.spit...@gmail.com> wrote:
>>>>
>>>>> Background:
>>>>>
>>>>> 1) Position Deletes
>>>>>
>>>>>
>>>>> Writers determine what rows are deleted and mark them in a 1-for-1
>>>>> representation. With delete vectors, this means every data file has at
>>>>> most one delete vector, which is read in conjunction with it to excise
>>>>> deleted rows. Reader overhead is more or less constant and very
>>>>> predictable.
>>>>>
>>>>>
>>>>> The main cost of this mode is that deletes must be determined at write
>>>>> time, which is expensive and can make conflict resolution more difficult.
>>>>>
>>>>> 2) Equality Deletes
>>>>>
>>>>> Writers write out a reference to what values are deleted (in a partition
>>>>> or globally). There can be an unlimited number of equality deletes, and
>>>>> they all must be checked for every data file that is read. The cost of
>>>>> determining deleted rows is essentially passed to the reader.
>>>>>
>>>>> Conflicts almost never happen since data files are not actually
>>>>> changed and there is almost no cost to the writer to generate these.
>>>>> Almost all costs related to equality deletes are passed on to the reader.
>>>>>
>>>>> Proposal:
>>>>>
>>>>> Equality deletes are, in my opinion, unsustainable and we should work
>>>>> on deprecating and removing them from the specification. At this time, I
>>>>> know of only one engine (Apache Flink) which produces these deletes but
>>>>> almost all engines have implementations to read them. Implementing
>>>>> equality deletes on the read path is difficult, and the cost is
>>>>> unpredictable in terms of memory usage and compute complexity. We’ve had
>>>>> suggestions of using RocksDB in order to handle ever-growing sets of
>>>>> equality deletes, which in my opinion shows that we are going down the
>>>>> wrong path.
>>>>>
>>>>> Outside of performance, equality deletes are also difficult to use in
>>>>> conjunction with many other features. For example, any features requiring
>>>>> CDC or row lineage are basically impossible when equality deletes are in
>>>>> use. When equality deletes are present, the state of the table can only be
>>>>> determined with a full scan, making it difficult to update differential
>>>>> structures. This means materialized views or indexes need to essentially
>>>>> be fully rebuilt whenever an equality delete is added to the table.
>>>>>
>>>>> Equality deletes essentially remove complexity from the write side but
>>>>> then add what I believe is an unacceptable level of complexity to the read
>>>>> side.
>>>>>
>>>>> Because of this I suggest we deprecate Equality Deletes in V3 and
>>>>> slate them for full removal from the Iceberg Spec in V4.
>>>>>
>>>>> I know this is a big change and a compatibility breakage, so I would like
>>>>> to introduce this idea to the community and solicit feedback from all
>>>>> stakeholders. I am very flexible on this issue and would like to hear the
>>>>> best arguments both for and against removal of equality deletes.
>>>>>
>>>>> Thanks everyone for your time,
>>>>>
>>>>> Russ Spitzer
>>>>>
>>>>>
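
As a rough illustration of the read-side difference described in the
background above, here is a minimal Python sketch. It is not Iceberg's
implementation: the names DataFile, EqualityDelete,
read_with_position_deletes and read_with_equality_deletes are hypothetical,
and the single sequence_number field is an assumed simplification of how
commit order decides which deletes apply to which data files.

```python
from dataclasses import dataclass


@dataclass
class DataFile:
    path: str
    rows: list[dict]       # each row maps column name -> value
    sequence_number: int   # assumed stand-in for commit order


@dataclass
class EqualityDelete:
    columns: tuple[str, ...]  # columns whose values identify deleted rows
    values: set[tuple]        # deleted value combinations for those columns
    sequence_number: int      # assumed stand-in for commit order


def read_with_position_deletes(data_file, delete_vectors):
    """Position deletes: at most one set of row positions per data file.

    The reader does one lookup per file and one membership test per row.
    """
    deleted = delete_vectors.get(data_file.path, set())
    return [row for pos, row in enumerate(data_file.rows) if pos not in deleted]


def read_with_equality_deletes(data_file, equality_deletes):
    """Equality deletes: every row is checked against every applicable delete.

    In this sketch a delete applies only to data files committed before it,
    so the reader's work grows with the number of accumulated deletes.
    """
    applicable = [
        d for d in equality_deletes
        if d.sequence_number > data_file.sequence_number
    ]
    live = []
    for row in data_file.rows:
        is_deleted = any(
            tuple(row[c] for c in d.columns) in d.values for d in applicable
        )
        if not is_deleted:
            live.append(row)
    return live


if __name__ == "__main__":
    f = DataFile(
        "data-1.parquet",
        [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 3, "v": "c"}],
        sequence_number=1,
    )
    # The writer already resolved "id = 2" to row position 1.
    print(read_with_position_deletes(f, {"data-1.parquet": {1}}))
    # The writer only recorded "id = 2"; the reader has to find the row.
    print(read_with_equality_deletes(f, [EqualityDelete(("id",), {(2,)}, 2)]))
```

Both calls keep the same rows, but the work sits in different places: with
position deletes the writer has already resolved the key to a row position,
while with equality deletes the reader repeats the matching for every data
file it scans.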
