This patch adds a vectorized implementation of the Mersenne Twister random number generator. This implementation is approximately 2.6 times faster than the non-vectorized implementation.
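For anyone who wants to check the speedup on their own hardware, here is a minimal timing sketch. It is not the benchmark used to obtain the figure above. The names time_engine and fast_engine are hypothetical helpers introduced only for this illustration, and the preprocessor test assumes that <ext/random> provides __gnu_cxx::sfmt19937 on little-endian libstdc++ targets; on anything else the sketch falls back to std::mt19937 so that it still builds.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <random>

// Assumption: libstdc++ on a little-endian target ships <ext/random> with
// __gnu_cxx::sfmt19937. Other configurations fall back to std::mt19937.
#if defined(__GLIBCXX__) && defined(__BYTE_ORDER__) && \
    (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#  include <ext/random>
using fast_engine = __gnu_cxx::sfmt19937;  // hypothetical alias for this sketch
#else
using fast_engine = std::mt19937;          // fallback, no speedup expected
#endif

// Draw n values from a copy of the engine and return the elapsed wall-clock
// time in seconds. The running sum is written to `sink` so the compiler
// cannot discard the loop as dead code.
template <typename Engine>
double time_engine(Engine eng, std::size_t n, std::uint64_t& sink)
{
  const auto start = std::chrono::steady_clock::now();
  std::uint64_t acc = 0;
  for (std::size_t i = 0; i < n; ++i)
    acc += eng();
  const auto stop = std::chrono::steady_clock::now();
  sink = acc;
  return std::chrono::duration<double>(stop - start).count();
}
```

Calling time_engine(std::mt19937(1729), n, sink) and time_engine(fast_engine(1729), n, sink) with the same large n, then dividing the two times, gives a rough ratio. The result depends on the optimization level, the CPU, and the value of n, so treat it as an estimate only.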
This implementation includes "arm_neon.h" when including the optimized <ext/random>. This has the effect of polluting the global namespace with the Neon intrinsics, so user macros and functions could potentially clash with them. Is this acceptable given this only happens when <ext/random> is explicitly included?

Comments and input are welcome.

Sample code to use the new generator would look like this:

    #include <random>
    #include <ext/random>
    #include <iostream>

    int main()
    {
      __gnu_cxx::sfmt19937 mt(1729);

      std::uniform_int_distribution<int> dist(0,1008);

      for (int i = 0; i < 16; ++i)
        {
          std::cout << dist(mt) << " ";
        }
    }

2017-06-01  Michael Collison  <michael.colli...@arm.com>

    Add optimized implementation of mersenne twister for aarch64
    * config/cpu/aarch64/opt/ext/opt_random.h: New file.
    (__arch64_recursion): new function.
    (operator==): New function.
    (simd_fast_mersenne_twister_engine): New template class.
    * config/cpu/aarch64/opt/bits/opt_random.h: New file.
    * include/ext/random (add include for arm_neon.h):
    (simd_fast_mersenne_twister_engine): add _M_state private
    array for ARM_NEON conditional compilation.
gnutools-4218-v10.patch