https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86504
--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Joel Hutton from comment #8)
> (In reply to Richard Biener from comment #3)
>
> Hi Richard,
>
> > So the vectorization issue would be that basic-block vectorization doesn't
> > catch this in a very nice way - on x86 we pull out the invariant computation
> > and have a vectorized (outer) loop storing to d.
>
> Just a small clarification, do you mean to say that there is a difference
> between the way x86 and aarch64 handle this, as far as I can see they handle
> this in the same way.

No, I was just mentioning I inspected this on x86 because I do expect the
handling to be the same.