The situation in which you would use equality deletes is when you do not want to read the existing table data. That seems at odds with a feature like row-level tracking, where you need to know which existing row a new row replaces. To me, a reasonable solution would be to simply say that equality deletes can't be used in tables where row-level tracking is enabled.
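To make the tension concrete, here is a minimal, hypothetical sketch (plain Java, not Iceberg or Flink APIs) of why a writer that only emits an equality delete plus a new data row has nothing to carry lineage from: the same commit contents can describe either an update or a delete followed by an unrelated insert.

    import java.util.List;

    // Minimal, hypothetical model of one commit's contents; these records are
    // illustrative only and are not Iceberg or Flink APIs.
    public class EqualityDeleteAmbiguity {
        record EqualityDelete(String keyColumn, Object keyValue) {}
        record DataRow(Object key, Object payload) {}
        record Commit(List<EqualityDelete> deletes, List<DataRow> adds) {}

        public static void main(String[] args) {
            // Case A: an UPDATE of key=5 (the old row is logically replaced).
            Commit update = new Commit(
                    List.of(new EqualityDelete("id", 5)),
                    List.of(new DataRow(5, "new payload")));

            // Case B: a DELETE of key=5 plus an unrelated INSERT that happens to use key=5.
            Commit deleteThenInsert = new Commit(
                    List.of(new EqualityDelete("id", 5)),
                    List.of(new DataRow(5, "new payload")));

            // The two commits are indistinguishable, because the writer never read the
            // existing row, so there is no safe way to carry its lineage id forward.
            System.out.println(update.equals(deleteThenInsert)); // prints: true
        }
    }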
On Mon, Aug 19, 2024 at 11:34 AM Russell Spitzer <russell.spit...@gmail.com> wrote:

> As far as I know, Flink is actually the only engine we have at the moment that can produce equality deletes, and only equality deletes have this specific problem. Since an equality delete can be written without actually knowing whether rows are being updated or not, it is always ambiguous whether a new row is an updated row, a newly added row, or a newly added row that arrived alongside the deletion of an existing row.
>
> I think in this case we need to ignore row versioning and just give every new row a brand new identifier. For a reader this means all updates look like a "delete" and an "add", with no "updates". For other processes (COW and position deletes) we only mark records as deleted or updated after finding them first, which makes it easy to take the lineage identifier from the source record and change it. For Spark, we just kept working on engine improvements (like SPJ and dynamic partition pushdown) to make that scan and join faster, but the latency is probably still somewhat higher.
>
> I think we could theoretically resolve equality deletes into updates at compaction time, but only if the user first defines accurate "row identity" columns, because otherwise we have no way of determining whether rows were updated or not. This is basically the issue we have now in the CDC procedures. Ideally, I think we need to find a way to have Flink locate updated rows at runtime using some better indexing structure or something like that, as you suggested.
>
> On Sat, Aug 17, 2024 at 1:07 AM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>
>> Hi Russell,
>>
>> As discussed offline, this would be very hard to implement with the current Flink CDC write strategies. I think this is true for every streaming writer.
>>
>> To track the previous version of a row, the streaming writer would need to scan the table, and it would need to do so for every record to find the previous version. This could be feasible if the data were stored in a way that supports fast queries on the primary key, like an LSM tree (see: Paimon [1]); otherwise it would be prohibitively costly and infeasible for higher loads. So adding a new storage strategy could be one solution.
>>
>> Alternatively, we might find a way for compaction to update the lineage fields. We could provide a way to link the equality deletes to the new rows that updated them during the write; then, at compaction time, we could update the lineage fields based on this information.
>>
>> Are there any better ideas from Spark streaming that we could adopt?
>>
>> Thanks,
>> Peter
>>
>> [1] - https://paimon.apache.org/docs/0.8/
>>
>> On Sat, Aug 17, 2024, 01:06 Russell Spitzer <russell.spit...@gmail.com> wrote:
>>
>>> Hi Y'all,
>>>
>>> We've been working on a new proposal to add Row Lineage to Iceberg in the V3 Spec. The general idea is to give every row a unique identifier as well as a marker of which version of the row it is. This should let us build a variety of features related to CDC, Incremental Processing and Audit Logging. If you are interested, please check out the linked proposal below. This will require compliance from all engines to be really useful, so it's important that we come to consensus on whether or not this is possible.
>>>
>>> https://docs.google.com/document/d/146YuAnU17prnIhyuvbCtCtVSavyd5N7hKryyVRaFDTE/edit?usp=sharing
>>>
>>> Thank you for your consideration,
>>> Russ
>>>

--
Ryan Blue
Databricks
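For anyone following the thread, here is a minimal sketch of the two behaviors Russell contrasts above: a copy-on-write / position-delete writer can carry lineage forward because it finds the old row first, while an equality-delete writer can only mint a fresh identifier, so readers see a delete plus an add. The Row type and its fields are hypothetical and are not the spec's column names or any Iceberg API.

    import java.util.UUID;

    // Hypothetical sketch of the two lineage behaviors discussed in the thread above;
    // the types and field names are illustrative only.
    public class LineageSketch {
        record Row(UUID rowId, long version, Object key, Object payload) {}

        // Copy-on-write / position-delete path: the writer located the old row first,
        // so the new row can inherit its lineage id and bump the version.
        static Row updateWithLookup(Row oldRow, Object newPayload) {
            return new Row(oldRow.rowId(), oldRow.version() + 1, oldRow.key(), newPayload);
        }

        // Equality-delete path: the old row is never read, so the only safe choice is a
        // brand-new lineage id; readers then see a delete plus an add rather than an update.
        static Row upsertWithEqualityDelete(Object key, Object newPayload) {
            return new Row(UUID.randomUUID(), 0L, key, newPayload);
        }

        public static void main(String[] args) {
            Row original = new Row(UUID.randomUUID(), 0L, 5, "v1");
            System.out.println(updateWithLookup(original, "v2"));  // same rowId, version 1
            System.out.println(upsertWithEqualityDelete(5, "v2")); // fresh rowId, version 0
        }
    }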