https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728
--- Comment #5 from Martin Reinecke <mar...@mpa-garching.mpg.de> ---
(In reply to Matthias Kretz (Vir) from comment #4)
> FWIW, using std::experimental::native_simd<double> also does not hoist the
> stores out of the loop. However, if you pass d by value and return d, the
> issue goes away. So I guess this is an aliasing pessimization.

This is an interesting data point ... In my first test case (attached to
https://gcc.gnu.org/pipermail/gcc-help/2021-March/139976.html), I explicitly
make a local copy of d and copy back at the end of the function, and this
didn't help. Strange.

> Even though
> you added __restrict__. In any case __m256 has the problem that it is
> declared with the may_alias attribute. I recommend to just never use __m256
> unless you have no other choice.

I guess I need it for unaligned loads/stores, correct? Otherwise __v4df should
work everywhere.
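
For readers following along, here is a minimal sketch of the two calling
conventions discussed above (accumulator by reference vs. by value and
returned). It is not the test case from this bug. The names v4df, Acc,
load_unaligned, accumulate_ref and accumulate_val are hypothetical, and the
memcpy-based unaligned load is just one possible approach that avoids the
may_alias intrinsic types. Whether GCC actually hoists the stores in either
variant depends on version and flags and is not claimed here.

```cpp
#include <cstddef>
#include <cstring>

// GCC/Clang generic vector of 4 doubles (hypothetical local typedef).
// Unlike the __m256d intrinsic type, it carries no may_alias attribute.
typedef double v4df __attribute__((vector_size(32)));

// Hypothetical helper: load 4 doubles from a possibly unaligned address.
// memcpy is an assumption of this sketch, not something the thread prescribes.
inline v4df load_unaligned(const double *p) {
    v4df v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Hypothetical accumulator standing in for "d" in the discussion.
struct Acc {
    v4df lo, hi;
};

// Variant 1: accumulator passed by reference. The compiler must prove that
// d does not overlap x before it can keep d.lo/d.hi in registers.
inline void accumulate_ref(Acc &d, const double *x, std::size_t n) {
    for (std::size_t i = 0; i + 8 <= n; i += 8) {
        d.lo += load_unaligned(x + i);
        d.hi += load_unaligned(x + i + 4);
    }
}

// Variant 2: accumulator passed by value and returned. The local copy
// cannot alias x, which is the property comment #4 relies on.
inline Acc accumulate_val(Acc d, const double *x, std::size_t n) {
    for (std::size_t i = 0; i + 8 <= n; i += 8) {
        d.lo += load_unaligned(x + i);
        d.hi += load_unaligned(x + i + 4);
    }
    return d;
}
```

Both variants compute the same sums; comparing the generated assembly (for
example with -O3 -mavx2) is the way to see whether the stores stay inside
the loop for a given compiler.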