> As I showed, those auto-vectorized kernels may be vectorized only in some
> situations, depending on the compiler version, the input datatypes...
More than anything, I would read the fact that this code was vectorized at all as an amazing win for compiler technology, since it is a very abstract way of gluing together different pieces of code using templates and lambda expressions. Many of the kernels we would be writing are probably basic unit tests [1] for the compiler's vectorizer, and I've hopefully shown that even very old compiler versions do just fine.

Anyway, in the worst case we will eventually write every kernel with xsimd and keep the autovectorized kernels only temporarily. If we find that performance is good on our platforms, then we can skip the "rewrite in xsimd" step.

Sasha

[1] https://github.com/llvm/llvm-project/blob/main/llvm/test/Transforms/LoopVectorize/gcc-examples.ll

> On 30 March 2022, at 23:38, Antoine Pitrou <anto...@python.org> wrote:
>
> As I showed, those auto-vectorized kernels may be vectorized only in some
> situations, depending on the compiler version, the input datatypes...
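
For readers following the thread, here is a minimal sketch of the pattern being discussed: a generic loop glued to an operation through a template and a lambda. The names `ApplyBinary` and `AddKernel` are hypothetical and are not taken from the Arrow codebase. Whether the loop actually gets vectorized is an assumption that depends on the compiler version, the optimization flags, and the element type, which is exactly the point under debate.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not Arrow's API): applies `op` element-wise.
// Each iteration is independent of the others, which is the shape a
// loop vectorizer generally looks for.
template <typename T, typename Op>
void ApplyBinary(const T* left, const T* right, T* out, std::size_t n, Op op) {
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = op(left[i], right[i]);
  }
}

// Hypothetical kernel: the generic loop combined with a lambda.
// Assumes `a` and `b` have the same length.
template <typename T>
std::vector<T> AddKernel(const std::vector<T>& a, const std::vector<T>& b) {
  std::vector<T> out(a.size());
  ApplyBinary(a.data(), b.data(), out.data(), a.size(),
              [](T x, T y) { return static_cast<T>(x + y); });
  return out;
}
```

One way to check what a given compiler does with such a loop is to build with optimizations enabled and ask for vectorizer remarks, for example `-Rpass=loop-vectorize` with Clang or `-fopt-info-vec` with GCC.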