https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112092
--- Comment #16 from Maciej W. Rozycki <macro at orcam dot me.uk> ---

As I said, GCC doesn't know that the inline asm makes use of the vector unit, so the compiler is free to make any optimisations it sees fit based on the vector code it has produced itself. In fact there is no (visible) vector unit use in this function at all, so the code for the intrinsic has only been retained artificially, because of the `volatile' keyword.

So I repeat: does the problem persist with the inline asm corrected?
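To illustrate the general point that GCC only knows what an inline asm declares through its operand and clobber lists, here is a minimal, architecture-neutral sketch. It is not the code from this report and uses no vector instructions; the function names (`launder', `memory_barrier', `sum_visible', `self_check') are hypothetical and exist only for illustration. The templates are empty, so no instructions are emitted, but the constraints still tell the compiler which values the asm may read or write. An asm that really touched vector state would presumably have to declare that state in the same lists.

```c
#include <assert.h>

/* Opaque pass-through: the empty template emits no instructions, but the
   "+r" operand tells GCC that the asm both reads and writes `x', so the
   value cannot be assumed unchanged across the statement.  */
static inline int launder(int x)
{
    __asm__ ("" : "+r" (x));
    return x;
}

/* The "memory" clobber tells GCC that the asm may read or write arbitrary
   memory, so cached memory values must not be carried across it.  */
static inline void memory_barrier(void)
{
    __asm__ __volatile__ ("" : : : "memory");
}

/* Sums `n' elements of `v', routing every element through the asm above.
   Because the operands are declared, the result stays well defined at any
   optimisation level.  */
int sum_visible(const int *v, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += launder(v[i]);
    memory_barrier();
    return total;
}

/* Small self-check of the sketch (hypothetical helper).  */
void self_check(void)
{
    int v[] = { 1, 2, 3, 4 };
    assert(sum_visible(v, 4) == 10);
}
```

By contrast, `volatile' on its own only keeps the asm statement from being deleted; it says nothing about which registers or state the asm uses, which is why it is no substitute for correct operands and clobbers.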