https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797
--- Comment #11 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Aha, I did not notice that we need special patterns (I expected this to be a problem to solve in machine-independent code). So I guess we have:

1) SLP should vectorize the 3 accesses with -ffast-math into only one vector operation (as opposed to the one vector + one scalar operation it does now)
2) we could add a divv2sf3 pattern which initializes elt 4 of operand2 to 1.0f to avoid funny results
3) we need to figure out why SLP vectorization is not even considered in the original testcase (which I do not seem to be able to dig out with reasonable effort in a way that preserves its original properties - to be vectorized by clang and not vectorized by gcc)
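
To illustrate the "funny results" in point 2, here is a minimal scalar model in C of a 2-float division carried out in a 4-lane register. It is only a sketch: the helper names (widen_v2sf, div_v4sf, count_bad_upper_lanes) are hypothetical and are not GCC internals, and it assumes that the lanes needing padding are the two unused upper lanes of the divisor. If those lanes are left at 0.0f, the full-width divide computes 0/0 there and produces NaN (and may raise FP exceptions); padding them with 1.0f keeps the unused lanes harmless.

```c
#include <math.h>

#define LANES 4 /* lanes in a full 128-bit float vector */
#define USED  2 /* lanes that carry real data in a 2-float value */

/* Hypothetical helper: widen a 2-float value to 4 lanes,
   filling the unused upper lanes with `pad`. */
static void widen_v2sf(const float in[USED], float pad, float out[LANES])
{
    for (int i = 0; i < LANES; i++)
        out[i] = (i < USED) ? in[i] : pad;
}

/* Models a full-width vector divide: every lane is divided,
   including the lanes nobody will read. */
static void div_v4sf(const float a[LANES], const float b[LANES],
                     float q[LANES])
{
    for (int i = 0; i < LANES; i++)
        q[i] = a[i] / b[i];
}

/* Divides num by den in a 4-lane model, padding the divisor's unused
   lanes with `den_pad`, and counts non-finite results (NaN or Inf)
   in the unused upper lanes. */
static int count_bad_upper_lanes(const float num[USED],
                                 const float den[USED], float den_pad)
{
    float a[LANES], b[LANES], q[LANES];
    int bad = 0;

    widen_v2sf(num, 0.0f, a);    /* dividend upper lanes are zero */
    widen_v2sf(den, den_pad, b); /* divisor upper lanes are the issue */
    div_v4sf(a, b, q);

    for (int i = USED; i < LANES; i++)
        if (!isfinite(q[i]))
            bad++;
    return bad;
}
```

Calling count_bad_upper_lanes with den_pad set to 0.0f and then to 1.0f shows the difference between the two padding choices. The model should be compiled without -ffast-math, since that option allows the compiler to assume NaNs do not occur.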