Hi,

I totally agree that Arrow should have built-in support for runtime 
dispatching, just like other popular computing libraries, to fully utilize 
modern hardware. We feel Arrow has great potential for performance gains from 
advanced CPU SIMD features.

I'm OK with stopping the current SIMD PR. My only concern is how long it will 
take before a basic runtime dispatching policy is ready to use. Does the kernel 
refactoring already include runtime dispatching?

Thanks,
Frank

-----Original Message-----
From: Wes McKinney <wesmck...@gmail.com> 
Sent: Wednesday, May 13, 2020 9:46 AM
To: dev <dev@arrow.apache.org>
Subject: [C++] Runtime SIMD dispatching for Arrow

hi,

We've started to receive a number of patches providing SIMD operations for both 
x86 and ARM architectures. Most of these patches make use of compiler 
definitions to toggle between code paths at compile time.

This is problematic for a few reasons:

* Binaries that are shipped (e.g. in Python) must generally be compiled for a 
broad set of supported processors. That means that AVX2 / AVX512 optimizations 
won't be available in these builds for processors that have them
* Poses a maintainability and testing problem (hard to test every combination, 
and it is not practical for local development to compile every combination, 
which may cause drawn out test/CI/fix cycles)

Other projects (e.g. NumPy) have taken the approach of building binaries that 
contain multiple variants of a function with different levels of SIMD, and then 
choosing at runtime which one to execute based on what features the CPU 
supports. This seems like what we ultimately need to do in Apache Arrow, and if 
we continue to accept patches that do not do this, it will be much more work 
later when we have to refactor things to runtime dispatching.
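
To make the idea concrete, here is a minimal sketch of that kind of runtime 
dispatch. It assumes GCC or Clang on x86-64 (for the `target` attribute and the 
`__builtin_cpu_supports` builtin), and every name in it (SumScalar, SumAvx2, 
ResolveSum, Sum) is hypothetical; it does not reflect Arrow's or NumPy's actual 
implementation. On other compilers or architectures it falls back to the scalar 
path.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical names throughout; this is an illustration, not Arrow's API.
using SumFn = int64_t (*)(const int32_t*, std::size_t);

// Portable baseline variant, always compiled.
static int64_t SumScalar(const int32_t* v, std::size_t n) {
  int64_t total = 0;
  for (std::size_t i = 0; i < n; ++i) total += v[i];
  return total;
}

#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
// Same loop, but only this function is compiled with AVX2 enabled, so the
// compiler is allowed to vectorize it with AVX2 instructions. The rest of
// the binary still targets the baseline instruction set.
__attribute__((target("avx2")))
static int64_t SumAvx2(const int32_t* v, std::size_t n) {
  int64_t total = 0;
  for (std::size_t i = 0; i < n; ++i) total += v[i];
  return total;
}
#endif

// Choose a variant based on what the running CPU reports.
static SumFn ResolveSum() {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
  __builtin_cpu_init();
  if (__builtin_cpu_supports("avx2")) return SumAvx2;
#endif
  return SumScalar;
}

// Public entry point: the variant is resolved once, on the first call.
int64_t Sum(const int32_t* v, std::size_t n) {
  static const SumFn fn = ResolveSum();
  return fn(v, n);
}
```

Callers only ever see Sum(), so a single shipped binary can carry both variants 
and use the faster one where the CPU supports it.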

We have some PRs in the queue related to SIMD. Without taking a heavy handed 
approach like starting to veto PRs, how would everyone like to begin to address 
the runtime dispatching problem?

Note that the Kernels revamp project I am working on right now will also 
facilitate runtime SIMD kernel dispatching for array expression evaluation.

Thanks,
Wes
