https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88713
--- Comment #26 from Chris Elrod <elrodc at gmail dot com> ---
> You can try enabling -mrecip to see RSQRT in .optimized - there's
> probably late 1/sqrt optimization on RTL.

No luck. The full commands I used:

gfortran -Ofast -mrecip -S -fdump-tree-optimized -march=native -shared -fPIC -mprefer-vector-width=512 -fno-semantic-interposition -o gfortvectorizationdump.s vectorization_test.f90

g++ -mrecip -Ofast -fdump-tree-optimized -S -march=native -shared -fPIC -mprefer-vector-width=512 -fno-semantic-interposition -o gppvectorization_test.s vectorization_test.cpp

g++'s output was similar:

  vect_U33_60.31_372 = SQRT (vect_S33_59.30_371);
  vect_Ui33_61.32_374 = { 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0, 1.0e+0 } / vect_U33_60.31_372;
  vect_U13_62.33_375 = vect_S13_47.24_359 * vect_Ui33_61.32_374;
  vect_U23_63.34_376 = vect_S23_53.27_365 * vect_Ui33_61.32_374;

and it has the same assembly as gfortran for the rsqrt:

        vcmpps  $4, %zmm0, %zmm5, %k1
        vrsqrt14ps      %zmm0, %zmm1{%k1}{z}
        vmulps  %zmm0, %zmm1, %zmm2
        vmulps  %zmm1, %zmm2, %zmm0
        vmulps  %zmm6, %zmm2, %zmm2
        vaddps  %zmm7, %zmm0, %zmm0
        vmulps  %zmm2, %zmm0, %zmm0
        vrcp14ps        %zmm0, %zmm10
        vmulps  %zmm0, %zmm10, %zmm0
        vmulps  %zmm0, %zmm10, %zmm0
        vaddps  %zmm10, %zmm10, %zmm10
        vsubps  %zmm0, %zmm10, %zmm10
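
For readers following the assembly, here is a rough scalar C++ model of what that instruction sequence appears to compute. It is a sketch of my reading of the listing, not compiler output. The first seven instructions look like sqrt(x) built from an rsqrt estimate plus one Newton-Raphson step, and the last five look like a reciprocal estimate refined by another Newton-Raphson step, so 1/sqrt(x) is produced in two stages. The contents of %zmm6 and %zmm7 are not shown in the listing; -0.5 and -3.0 are assumed here because those are the constants this refinement formula needs. The helpers approx_rsqrt and approx_rcp are hypothetical stand-ins for vrsqrt14ps and vrcp14ps that just perturb the exact value by a small relative error.

```cpp
#include <cmath>

// Hypothetical stand-ins for the hardware estimates (vrsqrt14ps / vrcp14ps).
// They return the exact value perturbed by a relative error of about 6e-5,
// roughly the 2^-14 bound those instructions document.
static float approx_rsqrt(float x) { return (1.0f / std::sqrt(x)) * (1.0f + 6.0e-5f); }
static float approx_rcp(float x)   { return (1.0f / x) * (1.0f - 6.0e-5f); }

// Stage 1: sqrt(x) from an rsqrt estimate r and one Newton-Raphson step:
//   sqrt(x) ~= (-0.5 * x * r) * (x * r * r - 3)
// ASSUMPTION: -0.5 and -3.0 are the values held in %zmm6 and %zmm7.
static float sqrt_via_rsqrt(float x) {
    // vcmpps $4 (not-equal) with {z} masking: lanes where x == 0 get r = 0,
    // assuming %zmm5 holds zeros.
    float r  = (x != 0.0f) ? approx_rsqrt(x) : 0.0f;
    float xr = x * r;              // vmulps %zmm0, %zmm1, %zmm2
    float t  = xr * r;             // vmulps %zmm1, %zmm2, %zmm0
    float h  = xr * -0.5f;         // vmulps %zmm6, %zmm2, %zmm2
    t = t + -3.0f;                 // vaddps %zmm7, %zmm0, %zmm0
    return h * t;                  // vmulps %zmm2, %zmm0, %zmm0
}

// Stage 2: 1/s from an rcp estimate y and one Newton-Raphson step:
//   1/s ~= 2*y - s*y*y
static float rcp_refined(float s) {
    float y = approx_rcp(s);       // vrcp14ps %zmm0, %zmm10
    float t = s * y;               // vmulps %zmm0, %zmm10, %zmm0
    t = t * y;                     // vmulps %zmm0, %zmm10, %zmm0
    float y2 = y + y;              // vaddps %zmm10, %zmm10, %zmm10
    return y2 - t;                 // vsubps %zmm0, %zmm10, %zmm10
}

// The two stages chained, as in the listing: sqrt first, then reciprocal.
static float inv_sqrt_two_stage(float x) {
    return rcp_refined(sqrt_via_rsqrt(x));
}
```

Calling inv_sqrt_two_stage(4.0f) gives a value close to 0.5. If this reading is right, the point is that a single rsqrt estimate with one refinement step would be enough for 1/sqrt(x), whereas the listing spends two estimates and two refinement steps, consistent with the SQRT followed by a separate division in the .optimized dump.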