Hi Peter,

FWIW, the Trino Iceberg connector writes delete files with just positions, without the row data. cc @Alexander Jo <alex...@starburstdata.com>
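For illustration, a minimal sketch of what writing such a position-only delete file looks like with the Iceberg Java API (not Trino's actual code; the table and the partition/task ids are assumed to come from the calling engine):

    import org.apache.iceberg.FileFormat;
    import org.apache.iceberg.Table;
    import org.apache.iceberg.data.GenericAppenderFactory;
    import org.apache.iceberg.data.Record;
    import org.apache.iceberg.deletes.PositionDeleteWriter;
    import org.apache.iceberg.io.OutputFileFactory;

    public class PositionOnlyDeleteSketch {
      public static void writeDeletes(Table table, int partitionId, long taskId)
          throws Exception {
        // Passing null for posDeleteRowSchema (last argument) means the writer
        // emits only the (file_path, pos) columns, no row data.
        GenericAppenderFactory appenderFactory =
            new GenericAppenderFactory(table.schema(), table.spec(), null, null, null);

        OutputFileFactory fileFactory =
            OutputFileFactory.builderFor(table, partitionId, taskId)
                .format(FileFormat.PARQUET)
                .build();

        PositionDeleteWriter<Record> writer = appenderFactory.newPosDeleteWriter(
            fileFactory.newOutputFile(), FileFormat.PARQUET, null /* unpartitioned */);
        try {
          // The two-argument overload writes no row value for the deleted position.
          writer.delete("s3://bucket/db/tbl/data/file-a.parquet", 42L);
        } finally {
          writer.close();
        }
        // writer.toDeleteFile() can then be committed with table.newRowDelta().
      }
    }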
> For the 1st point we just need to collect the statistics during the
> delete, but we do not have to actually persist the data.

I would be wary of creating ORC/Parquet files with statistics that do not match the actual file contents.

> Am I missing something? Is there a use-case where positional deletes with
> row values are significantly more effective?

I recall some mention of the CDC use-case -- producing CDC events from the changes to a table. But I think I recall someone mentioning that this usually ends up needing a join with the actual data files anyway. @Ryan Blue <b...@tabular.io> will know better, but in the meantime you can probably also dig the topic up in the mailing list.

Best
PF

On Thu, May 5, 2022 at 3:59 PM Peter Vary <pv...@cloudera.com.invalid> wrote:

> Hi Team,
>
> We are working on integrating Iceberg V2 tables with Hive, enabling
> delete and update operations.
> The delete is implemented by Marton and the first version is already
> merged: https://issues.apache.org/jira/browse/HIVE-26102
> The update statement is still in progress:
> https://issues.apache.org/jira/browse/HIVE-26136
> The edges are a bit rough for the time being, so don't use this in
> production :D
>
> During the implementation we found that deletes were quite
> straightforward to implement with Iceberg positional deletes, and without
> much effort we were able to provide the row values too. OTOH for updates
> we need to sort the delete files and the data files differently. ATM we
> have only a single result table, so we ended up implementing our own
> writer, which is very similar to
> https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/io/SortedPosDeleteWriter.java
> to do the sorting of the delete records for us. The problem with the
> SortedPosDeleteWriter is that as the record size grows, the number of
> records we can keep in memory decreases. So we ended up with our own
> writer, which keeps only the bare minimum in memory and writes only
> positional deletes without the actual row values. See:
> https://github.com/apache/hive/blob/master/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergBufferedDeleteWriter.java
>
> The question is:
> - What is the experience of the community? When is it beneficial to have
> the row values in the positional delete files in production?
>
> My feeling is:
>
> 1. The row data is best used when there is a filter in the query and
> we can filter out whole delete files when running the query.
> 2. There could be a slight improvement when we can skip
> RowGroups/Stripes based on the filter.
>
> For the 1st point we just need to collect the statistics during the
> delete, but we do not have to actually persist the data.
> Would it be viable to create such delete files, where the statistics
> could not be recalculated directly from the files themselves?
> Would the community accept such files?
>
> OTOH we have significant downsides for positional deletes with row
> values:
>
> 1. The delete file size increases significantly.
> 2. We have to keep a smaller RowGroup/Stripe size in the delete files
> to accommodate the bigger amount of raw data, so we have to read more
> footers, adding I/O overhead.
>
> So my feeling is that, generally speaking, positional deletes without
> the actual row data would be more performant than positional deletes
> with row data.
>
> Am I missing something? Is there a use-case where positional deletes
> with row values are significantly more effective?
>
> Thanks,
> Peter
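For reference, the buffering idea Peter describes can be sketched roughly like this: keep only a bitmap of deleted positions per data file in memory, then replay them in sorted order into a position-only delete writer. The class and method names below are illustrative, not the actual HiveIcebergBufferedDeleteWriter code:

    import java.io.IOException;
    import java.util.Map;
    import java.util.TreeMap;
    import org.apache.iceberg.deletes.PositionDeleteWriter;
    import org.roaringbitmap.longlong.Roaring64Bitmap;

    public class BufferedPositionDeleteSketch {
      // TreeMap keeps the data file paths sorted; position deletes must be
      // ordered by (file_path, pos) within a delete file.
      private final Map<String, Roaring64Bitmap> buffered = new TreeMap<>();

      // Memory cost is one compressed bitmap per data file, independent of
      // the size of the deleted rows themselves.
      public void delete(String dataFilePath, long position) {
        buffered.computeIfAbsent(dataFilePath, path -> new Roaring64Bitmap())
            .addLong(position);
      }

      // Replays the buffered positions into a position-only delete writer.
      public void flush(PositionDeleteWriter<?> writer) throws IOException {
        for (Map.Entry<String, Roaring64Bitmap> entry : buffered.entrySet()) {
          // Bitmap iteration yields the positions in ascending order.
          entry.getValue().forEach(pos -> writer.delete(entry.getKey(), pos));
        }
        writer.close();
      }
    }

Unlike SortedPosDeleteWriter, the memory use of this approach does not grow with the record size, which matches the motivation above.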