Hi Russell,

Thanks for working to support equality deletes with row lineage!

> accept that "updates" will be treated as "delete" and "insert"

I would say that this has some obvious drawbacks (though it is still
better than not supporting equality deletes at all):
1) updates will show up as separate deletes and inserts when changelogs
are output to users or downstream databases
2) it leads to more computation for incremental processing, such as
refreshing materialized views
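
To make both points concrete, here is a minimal Python sketch. It is not
Iceberg code: `Row`, `changelog`, and the row ID values are hypothetical,
and it assumes that an equality-delete based update writes the new row
with a fresh row ID because the writer never reads the old row.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    row_id: int  # hypothetical stand-in for the row lineage ID
    key: str
    value: str


def changelog(deleted: list[Row], inserted: list[Row]):
    """Pair a delete and an insert into one UPDATE only if they share a row_id."""
    inserted_by_id = {r.row_id: r for r in inserted}
    matched = set()
    events: list[tuple[str, Optional[Row], Optional[Row]]] = []
    for old in deleted:
        new = inserted_by_id.get(old.row_id)
        if new is not None:
            events.append(("UPDATE", old, new))
            matched.add(old.row_id)
        else:
            events.append(("DELETE", old, None))
    for new in inserted:
        if new.row_id not in matched:
            events.append(("INSERT", None, new))
    return events


# Row ID carried over (e.g. an update done with position deletes).
preserved = changelog([Row(7, "k1", "a")], [Row(7, "k1", "b")])

# Assumed equality-delete case: the new row gets a fresh row ID.
fresh = changelog([Row(7, "k1", "a")], [Row(42, "k1", "b")])
```

In the second case a downstream consumer receives two events instead of
one and has to re-join them on the key if it wants to recover the update.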

At the same time, I would like to ask whether it would help if we
supported rewriting equality deletes to position deletes.
There was an earlier effort, but it has been closed:
https://github.com/apache/iceberg/pull/2216

Best,
Gang


On Wed, Feb 12, 2025 at 7:25 AM Russell Spitzer <russell.spit...@gmail.com>
wrote:

> Hi Y'all,
>
> As we have been working on the row lineage implementation, a few folks
> in the community have reached out to me about changing our defined
> behavior around equality deletes.
>
> Currently when Row Lineage is enabled, the spec says to disable equality
> deletes for the table.
>
> In the interest of compatibility with Flink and other equality delete
> producers, I originally wrote that we would simply treat all equality
> delete based updates as a pure insert and delete. At the time, some folks
> thought this was too open and worried that it would be poor behavior,
> which led to the current restriction.
>
> Now that we are actually implementing it, I think there have been some
> changes of heart and that we should go back to the original design. I'd
> like to see if we have consensus in the community to change the wording
> back and allow equality deletes.
>
> PR: https://github.com/apache/iceberg/pull/12230
>
> TL;DR:
>
> Allow equality deletes with row lineage but accept that "updates" will be
> treated as "delete" and "insert"
>
> Thanks for your time,
> Russ
>
