> The main downside with using the mask (or any solution based on a filter > node / filtering) is that it requires that the delete indices go into the > plan itself. So you need to first read the delete files and then create > the plan. And, if there are many deleted rows, this can be costly.
Ah, I see. I was assuming you could load the indices within the fragment scan, at the same time the page index was read. That's how I'm implementing with Lance, and how I plan to implement with Delta Lake. But if you can't do that, then filtering with an anti-join makes sense. You wouldn't want to include those in a plan. On Fri, Jun 2, 2023 at 7:38 AM Weston Pace <weston.p...@gmail.com> wrote: > Also, for clarity, I do agree with Gang that these are both valuable > features in their own right. A mask makes a lot of sense for page indices. > > On Fri, Jun 2, 2023 at 7:36 AM Weston Pace <weston.p...@gmail.com> wrote: > > > > then I think the incremental cost of adding the > > > positional deletes to the mask is probably lower than the anti-join. > > Do you mean developer cost? Then yes, I agree. Although there may be > > some subtlety in the pushdown to connect a dataset filter to a parquet > > reader filter. > > > > The main downside with using the mask (or any solution based on a filter > > node / filtering) is that it requires that the delete indices go into the > > plan itself. So you need to first read the delete files and then create > > the plan. And, if there are many deleted rows, this can be costly. > > > > On Thu, Jun 1, 2023 at 7:13 PM Will Jones <will.jones...@gmail.com> > wrote: > > > >> That's a good point, Gang. To perform deletes, we definitely need the > row > >> index, so we'll want that regardless of whether it's used in scans. > >> > >> > I'm not sure a mask would be the ideal solution for Iceberg (though it > >> is > >> a reasonable feature in its own right) because I think position-based > >> deletes, in Iceberg, are still done using an anti-join and not a filter. > >> > >> For just positional deletes in isolation, I agree the mask wouldn't be > >> more > >> optimal than the anti-join. But if they end up using the mask for > >> filtering > >> with the page index, then I think the incremental cost of adding the > >> positional deletes to the mask is probably lower than the anti-join. > >> > >> On Thu, Jun 1, 2023 at 6:33 PM Gang Wu <ust...@gmail.com> wrote: > >> > >> > IMO, the adding a row_index column from the reader is orthogonal to > >> > the mask implementation. Table formats (e.g. Apache Iceberg and > >> > Delta) require the knowledge of row index to finalize row deletion. It > >> > would be trivial to natively support row index from the file reader. > >> > > >> > Best, > >> > Gang > >> > > >> > On Fri, Jun 2, 2023 at 3:40 AM Weston Pace <weston.p...@gmail.com> > >> wrote: > >> > > >> > > I agree that having a row_index is a good approach. I'm not sure a > >> mask > >> > > would be the ideal solution for Iceberg (though it is a reasonable > >> > feature > >> > > in its own right) because I think position-based deletes, in > Iceberg, > >> are > >> > > still done using an anti-join and not a filter. > >> > > > >> > > That being said, we probably also want to implement a streaming > >> > merge-based > >> > > anti-join because I believe delete files are ordered by row_index > and > >> so > >> > a > >> > > streaming approach is likely to be much more performant. > >> > > > >> > > On Mon, May 29, 2023 at 4:01 PM Will Jones <will.jones...@gmail.com > > > >> > > wrote: > >> > > > >> > > > Hi Rusty, > >> > > > > >> > > > At first glance, I think adding a row_index column would make > >> sense. To > >> > > be > >> > > > clear, this would be an index within a file / fragment, not across > >> > > multiple > >> > > > files, which don't necessarily have a known ordering in Acero > >> (IIUC). > >> > > > > >> > > > However, another approach would be to take a mask argument in the > >> > Parquet > >> > > > reader. We may wish to do this anyways for support for using > >> predicate > >> > > > pushdown with Parquet's page index. While Arrow C++ hasn't yet > >> > > implemented > >> > > > predicate pushdown on page index (right now just supports row > >> groups), > >> > > > Arrow Rust has and provides an API to pass in a mask to support > it. > >> The > >> > > > reason for this implementation is described in the blog post > >> "Querying > >> > > > Parquet with Millisecond Latency" [1], under "Page Pruning". The > >> > > > RowSelection struct API is worth a look [2]. > >> > > > > >> > > > I'm not yet sure which would be preferable, but I think adopting a > >> > > similar > >> > > > pattern to what the Rust community has done may be wise. It's > >> possible > >> > > that > >> > > > row_index is easy to implement while the mask will take time, in > >> which > >> > > case > >> > > > row_index makes sense as an interim solution. > >> > > > > >> > > > Best, > >> > > > > >> > > > Will Jones > >> > > > > >> > > > [1] > >> > > > > >> > > > > >> > > > >> > > >> > https://arrow.apache.org/blog/2022/12/26/querying-parquet-with-millisecond-latency/ > >> > > > [2] > >> > > > > >> > > > > >> > > > >> > > >> > https://docs.rs/parquet/40.0.0/parquet/arrow/arrow_reader/struct.RowSelection.html > >> > > > > >> > > > On Mon, May 29, 2023 at 2:12 PM Rusty Conover > >> <ru...@conover.me.invalid > >> > > > >> > > > wrote: > >> > > > > >> > > > > Hi Arrow Team, > >> > > > > > >> > > > > I wanted to suggest an improvement regarding Acero's Scan node. > >> > > > > Currently, it provides useful information such as > >> __fragment_index, > >> > > > > __batch_index, __filename, and __last_in_fragment. However, it > >> would > >> > > > > be beneficial to have an additional column that returns an > overall > >> > > > > "row index" from the source. > >> > > > > > >> > > > > The row index would start from zero and increment for each row > >> > > > > retrieved from the source, particularly in the case of Parquet > >> files. > >> > > > > Is it currently possible to obtain this row index or would > >> expanding > >> > > > > the Scan node's behavior be required? > >> > > > > > >> > > > > Having this row index column would be valuable in implementing > >> > support > >> > > > > for Iceberg's positional-based delete files, as outlined in the > >> > > > > following link: > >> > > > > > >> > > > > https://iceberg.apache.org/spec/#delete-formats > >> > > > > > >> > > > > While Iceberg's value-based deletes can already be performed > using > >> > the > >> > > > > support for anti joins, using a projection node does not > guarantee > >> > the > >> > > > > row ordering within an Acero graph. Hence, the inclusion of a > >> > > > > dedicated row index column would provide a more reliable > solution > >> in > >> > > > > this context. > >> > > > > > >> > > > > Thank you for considering this suggestion. > >> > > > > > >> > > > > Rusty > >> > > > > > >> > > > > >> > > > >> > > >> > > >