Hi Sasha,

Could you elaborate on the problems with the XSIMD dependency? What you describe sounds a lot like what XSIMD already provides in prepackaged form, without the extra CMake magic.
I have to occasionally build Arrow with an external build system and it sounds like this type of logic could add complexity there.

Thanks,
Micah

On Tue, Mar 29, 2022 at 3:14 PM Sasha Krassovsky <krassovskysa...@gmail.com> wrote:

> Hi everyone,
> I've noticed that we include xsimd as an abstraction over all of the simd
> architectures. I'd like to propose a different solution which would result
> in fewer lines of code, while being more readable.
>
> My thinking is that anything simple enough to abstract with xsimd can be
> autovectorized by the compiler. Any more interesting SIMD algorithm usually
> is tailored to the target instruction set and can't be abstracted away with
> xsimd anyway.
>
> With that in mind, I'd like to propose the following strategy:
> 1. Write a single source file with simple, element-at-a-time for loop
> implementations of each function.
> 2. Compile this same source file several times with different compile flags
> for different vectorization (e.g. if we're on an x86 machine that supports
> AVX2 and AVX512, we'd compile once with -mavx2 and once with -mavx512vl).
> 3. Functions compiled with different instruction sets can be differentiated
> by a namespace, which gets defined during the compiler invocation. For
> example, for AVX2 we'd invoke the compiler with -DNAMESPACE=AVX2 and then
> for something like elementwise addition of two arrays, we'd call
> arrow::AVX2::VectorAdd.
>
> I believe this would let us remove xsimd as a dependency while also giving
> us lots of vectorized kernels at the cost of some extra cmake magic. After
> that, it would just be a matter of making the function registry point to
> these new functions.
>
> Please let me know your thoughts!
>
> Thanks,
> Sasha Krassovsky
>
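
For concreteness, the three-step strategy in the quoted proposal could look roughly like the sketch below. This is only an illustration under assumptions: the file name `vector_add.cc`, the `arrow_sketch` namespace, the `Scalar` fallback, the `float` signature of `VectorAdd`, and the `-O3` flag in the example invocations are all made up here and are not existing Arrow code or build settings. Only `-mavx2`, `-mavx512vl`, and `-DNAMESPACE=...` come from the proposal itself.

```cpp
// vector_add.cc (hypothetical file name): one source, compiled once per
// instruction set. Example invocations, assuming a GCC/Clang-style driver:
//   c++ -O3 -mavx2     -DNAMESPACE=AVX2   -c vector_add.cc -o vector_add_avx2.o
//   c++ -O3 -mavx512vl -DNAMESPACE=AVX512 -c vector_add.cc -o vector_add_avx512.o
#include <cstddef>

// Assumption: fall back to a plain build when no namespace is passed in.
#ifndef NAMESPACE
#define NAMESPACE Scalar
#endif

namespace arrow_sketch {  // stand-in for the real arrow namespace
namespace NAMESPACE {     // expands to AVX2, AVX512, Scalar, ...

// Simple element-at-a-time loop; vectorization is left entirely to the
// compiler and depends on the flags used for this translation unit.
void VectorAdd(const float* left, const float* right, float* out,
               std::size_t length) {
  for (std::size_t i = 0; i < length; ++i) {
    out[i] = left[i] + right[i];
  }
}

}  // namespace NAMESPACE
}  // namespace arrow_sketch
```

Each object file then exports its own symbol (for example `arrow_sketch::AVX2::VectorAdd`), so a runtime dispatcher or function registry could pick one after checking CPU features. The part this sketch leaves out is the build-system side, i.e. compiling the same file several times with different flags, which is exactly where an external build system would have to replicate the logic.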