On 31/03/2022 at 09:19, Sasha Krassovsky wrote:
>> As I showed, those auto-vectorized kernels may be vectorized only in
>> some situations, depending on the compiler version, the input
>> datatypes...
>
> I would more than anything interpret the fact that that code was
> vectorized at all as an amazing win for compiler technology, as it’s a
> very abstract way of gluing together different pieces of code using
> templates and lambda expressions.
That's a possible interpretation, but it doesn't really help the bottom
line :-)
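(For context, the kernels being discussed are roughly of the shape
below. This is a simplified, illustrative sketch, not the actual Arrow
kernel machinery; the names are made up for the example.)

    #include <cstddef>
    #include <cstdint>

    // Illustrative sketch: an element-wise loop assembled from a template
    // parameter and a lambda, relying on the compiler to auto-vectorize it.
    template <typename T, typename Op>
    void BinaryKernel(const T* left, const T* right, T* out,
                      std::size_t n, Op op) {
      for (std::size_t i = 0; i < n; ++i) {
        out[i] = op(left[i], right[i]);
      }
    }

    void AddInt32(const std::int32_t* a, const std::int32_t* b,
                  std::int32_t* out, std::size_t n) {
      // Whether this loop becomes SIMD code depends on the compiler,
      // its version and flags, and the element type.
      BinaryKernel(a, b, out, n,
                   [](std::int32_t x, std::int32_t y) { return x + y; });
    }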
> A lot of the kernels that we would be writing are probably basic unit
> tests [1] for the compiler’s vectorizer, and I’ve hopefully shown that
> even very old versions do just fine.
>
> Anyway, in the worst case we will eventually write every kernel with
> xsimd, and have the autovectorized kernels temporarily there. If we
> find that performance is good on our platforms, then we can skip the
> “rewrite in xsimd” step.
"Our platforms" are rather broad however. We have binary packages for
Windows, macOS, Linux, using several compilers and toolchains (because
there are R packages, Python packages and sometimes C++ packages). For
example, on Windows the R packages are built with different versions of
MinGW/gcc depending on the R version, while the Python packages are
built with some version of MSVC (which might be of a different version
depending on whether it's a conda package or a Python wheel, I'm not sure).
And there are of course the different architectures: we support x86 and
arm64 for both macOS and Linux, for example; we might even have ppc64
packages of some sort (?).
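As for the eventual "rewrite in xsimd" step, it would look roughly like
the sketch below (illustrative only, assuming xsimd's default
architecture; this is not actual Arrow code). The SIMD width and the
scalar tail become explicit instead of being left to the compiler:

    #include <cstddef>
    #include <xsimd/xsimd.hpp>

    // Illustrative sketch: the same trivial add kernel written with
    // explicit xsimd batches.
    void AddFloat(const float* a, const float* b, float* out,
                  std::size_t n) {
      using batch = xsimd::batch<float>;
      constexpr std::size_t width = batch::size;
      std::size_t i = 0;
      for (; i + width <= n; i += width) {
        batch x = batch::load_unaligned(a + i);
        batch y = batch::load_unaligned(b + i);
        (x + y).store_unaligned(out + i);
      }
      for (; i < n; ++i) {
        out[i] = a[i] + b[i];  // scalar tail for the remainder
      }
    }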
Regards
Antoine.