https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31485

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |53947
                 CC|                            |uros at gcc dot gnu.org

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Joel Yliluoma from comment #11)
> Looks like this issue has taken a step or two *backwards* in the past years.
>
> Where as the second function used to be vectorized properly, today it seems
> neither of them are.

Which version do you see vectorizing the second (add2) function?

> Contrast this with Clang, which compiles *both* functions into a single
> instruction:
>
>   vaddps xmm0, xmm1, xmm0
>
> or some variant thereof depending on the -m options.
>
> Compiler Explorer link: https://godbolt.org/z/2AKhnt

The main issues on the GCC side are
 a) ABI details not exposed at the point of vectorization (several PRs
    about this exist)
 b) "Poor" support for two-element float vectors (an understatement,
    we have some support for MMX but that's integer only, but I'm not
    sure we've enabled the 3dnow part to be emulated with SSE)

oddly enough even with -mmmx -m3dnow I see add2 lowered by veclower
so the vector type or the vector add must be unsupported(?).

llvm is known to support emulating smaller vectors just fine (and by
design is also aware of ABI details).


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations
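
For readers without the Compiler Explorer link at hand, the kind of two-element float vector add discussed in comment #12 might look roughly like the sketch below. This is not the code from the link: only the name add2 comes from the comment, while the typedef name v2sf, the struct float2 and the function add_pair are hypothetical and chosen purely for illustration. It assumes the GCC/Clang vector_size extension.

```c
/* Hypothetical sketch: v2sf, float2 and add_pair are illustrative names,
   not taken from the bug report or the Compiler Explorer link. */

/* Two floats packed into an 8-byte generic vector (GCC/Clang extension). */
typedef float v2sf __attribute__((vector_size(8)));

/* Element-wise add on the generic vector type. Comment #12 reports that
   GCC's veclower pass lowers an add like this to scalar operations. */
v2sf add2(v2sf a, v2sf b)
{
    return a + b;
}

/* The same arithmetic on a plain struct passed by value. How the two
   floats arrive in registers here is decided by the ABI, which is the
   detail point (a) says is not visible at the point of vectorization. */
struct float2 { float x, y; };

struct float2 add_pair(struct float2 a, struct float2 b)
{
    struct float2 r = { a.x + b.x, a.y + b.y };
    return r;
}
```

Compiling this with something like -O2 and comparing the assembly from GCC and Clang is one way to check whether a given compiler version emits a single packed add or falls back to scalar adds.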