It should be possible to unroll the sentinel version in many cases. For
instance,

     sum += (data[i] == SENTINEL) * data[i]

This doesn't work with NaN as a sentinel because 0 * NaN => NaN, but it can
work with other values.


On Tue, Oct 16, 2018 at 9:38 AM Antoine Pitrou <anto...@python.org> wrote:

>
> Hi Wes,
>
> Le 16/10/2018 à 14:05, Wes McKinney a écrit :
> > hi folks,
> >
> > I explored a bit the performance implications of using validity
> > bitmaps (like the Arrow columnar format) vs. sentinel values (like
> > NaN, INT32_MIN) for nulls:
> >
> > http://wesmckinney.com/blog/bitmaps-vs-sentinel-values/
> >
> > The vectorization results may be of interest to those implementing
> > analytic functions targeting the Arrow memory format. There's probably
> > some other optimizations that can be employed, too.
>
> This is a nice write-up.  It may also possible to further speed up
> things using explicit SIMD operations.
>
> For the non-null case, it should be relatively doable, see e.g.
> https://p12tic.github.io/libsimdpp/v2.2-dev/libsimdpp/w/int/reduce_add.html
> or
> https://p12tic.github.io/libsimdpp/v2.2-dev/libsimdpp/w/fp/reduce_add.html
> .
>
> For the with-nulls case, it might be possible to do something with SIMD
> masks, but I'm not competent to propose anything concrete :-)
>
> Regards
>
> Antoine.
>
>
> >
> > Caveat: it's entirely possible I made some mistakes in my code. I
> > checked the various implementations for correctness only, and did not
> > dig too deeply beyond that.
> >
> > - Wes
> >
>

Reply via email to