Thank you for sharing, Wes, an interesting paper indeed. In Rust we currently use a different strategy: we build an iterator over half-open ranges [a_i, b_i[ to be selected from the filter bitmap, and filter the array based on those ranges. For a single filter, the ranges are consumed as they are being built; for multiple filters, we persist all the ranges ([a_0, b_0[, [a_1, b_1[, ...) and then apply them to each array (potentially in parallel).
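For concreteness, here is a rough sketch of the two steps (deriving the ranges, then memcopying the slices). This is not the actual code in the Rust implementation; it assumes an LSB-first bitmap with no bit offset, and the names selected_ranges / filter_values are made up for the example:

use std::ops::Range;

// Build the half-open ranges [a, b) of set bits in an LSB-first bitmap
// of `len` bits. Sketch only, not the actual implementation.
fn selected_ranges(bitmap: &[u8], len: usize) -> Vec<Range<usize>> {
    let mut ranges = Vec::new();
    let mut start: Option<usize> = None;

    for (byte_idx, &byte) in bitmap.iter().enumerate() {
        let base = byte_idx * 8;
        if base >= len {
            break;
        }
        // Fast paths on whole bytes: skip 8 zeros or extend over 8 ones.
        if byte == 0x00 && start.is_none() && base + 8 <= len {
            continue; // nothing selected in this chunk
        }
        if byte == 0xFF && start.is_some() && base + 8 <= len {
            continue; // the current range keeps growing
        }
        // Generic path: inspect individual bits.
        for bit in 0..8 {
            let i = base + bit;
            if i >= len {
                break;
            }
            let set = byte & (1 << bit) != 0;
            match (set, start) {
                (true, None) => start = Some(i),       // open a range
                (false, Some(a)) => {
                    ranges.push(a..i);                 // close it
                    start = None;
                }
                _ => {}
            }
        }
    }
    if let Some(a) = start {
        ranges.push(a..len);
    }
    ranges
}

// Filter a primitive values buffer by copying each selected slice,
// rather than taking individual items.
fn filter_values<T: Copy>(values: &[T], ranges: &[Range<usize>]) -> Vec<T> {
    let selected = ranges.iter().map(|r| r.len()).sum();
    let mut out = Vec::with_capacity(selected);
    for r in ranges {
        out.extend_from_slice(&values[r.clone()]); // one memcopy per range
    }
    out
}

fn main() {
    let values = [10i32, 11, 12, 13, 14, 15, 16, 17];
    let bitmap = [0b0011_1100u8]; // selects indices 2..6
    let ranges = selected_ranges(&bitmap, values.len());
    assert_eq!(ranges, vec![2..6]);
    assert_eq!(filter_values(&values, &ranges), vec![12, 13, 14, 15]);
}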
The main advantages are:

1. it allows memcopying values buffers in slices instead of individual items, thereby reducing the total number of memcopies performed;
2. when the bitmap has no bit offset, we efficiently skip whole chunks via a simple comparison (byte == 0);
3. it lends itself well to clustered data, i.e. when the selections come in runs.

The best case scenarios are the two extremes, [0, 0, 0, 0, ...] and [1, 1, 1, 1, ...]; the worst case is an alternating bitmap such as [0, 1, 0, 1, 0, 1], which is more expensive than "take"ing, as extra cycles are spent deriving the ranges.

I do not have the numbers at hand, but there was a performance improvement that motivated us to switch strategies. Besides the obvious savings in memcopies, my understanding is that the cost is not so much dependent on the selectivity, but rather on the number of 0-to-1 and 1-to-0 transitions throughout the bitmap.

Best,
Jorge

On Wed, Jun 23, 2021 at 2:05 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
> Some on this list might be interested in a new paper out of CMU/MIT
> about the use of selection vectors and bitmaps for handling the
> intermediate results of filters:
>
> https://db.cs.cmu.edu/papers/2021/ngom-damon2021.pdf
>
> The research was done in the context of NoisePage which uses Arrow as
> its memory format. I found some of the observations related to AVX512
> to be interesting.