https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908
--- Comment #26 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Richard Biener from comment #22)
> (In reply to Hongtao.liu from comment #21)
> > Now we have SLP node available in vector cost hook, maybe we can do sth in
> > cost model to prevent vectorization when node's definition from big-size
> > parameter.
>
> Note we vectorize a load here for which we do not pass down an SLP node.
> But of course there's the stmt-info one could look at - but the issue
> is that for SLP that doesn't tell you which part of the variable is accessed.
> Also even if we were to pass down the SLP node we do not know exactly how
> it is going to vectorize - but sure, we could play with some heuristics
> there.
>
> For x86 we can just assume that all aggregates > 16 bytes are passed on the
> stack, correct?  Note I see for
>
> #include <stdlib.h>
>
> struct X { double x[3]; };
> typedef double v2df __attribute__((vector_size(16)));
>
> v2df __attribute__((noipa))
> foo (struct X x, struct X y)
> {
>   return (v2df) {x.x[1], x.x[2] } + (v2df) { y.x[0], y.x[1] };
> }
>
> struct X y;
> int main(int argc, char **argv)
> {
>   struct X x = y;
>   int cnt = atoi (argv[1]);
>   for (int i = 0; i < cnt; ++i)
>     foo (x, x);
>   return 0;
> }
>
> the structs passed as
>
>         movups  %xmm0, 24(%rsp)
>         movq    %rax, 40(%rsp)
>         movq    %rax, 16(%rsp)
>         movups  %xmm0, (%rsp)
>         call    foo
>
> so alignment of the stack variable depends on the position of the
> function argument (and thus preceding parameters).  That means
> we cannot rely on &y being 16 byte aligned and it seems we cannot
> rely on a particular store sequence order either here.

We can start by disabling this vectorization under the very-cheap cost model
to fix the -O2 regressions, then fine-tune it in GCC 13.
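
To check the alignment point from comment #22 on a given target, something like the sketch below could be used. It is only an illustration: `probe` and `report_arg_alignment` are made-up names, not part of the testcase, and `noinline` is used in place of `noipa` purely for portability to compilers that lack the latter. It records each by-value argument's address modulo 16; what it prints depends on the ABI, the compiler and the options, so no particular output is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

struct X { double x[3]; };      /* same 24-byte aggregate as in the testcase */

/* Address of each incoming argument modulo 16, as recorded by probe().  */
static unsigned off_x, off_y;

/* noinline (a GCC/Clang extension) keeps a real call, so the arguments
   are actually materialised by the caller instead of being optimised away.  */
__attribute__((noinline)) void
probe (struct X x, struct X y)
{
  off_x = (unsigned) ((uintptr_t) &x % 16);
  off_y = (unsigned) ((uintptr_t) &y % 16);
}

/* Hypothetical helper: pass the same struct twice and print what the
   callee observed.  */
void
report_arg_alignment (void)
{
  struct X v = { { 1.0, 2.0, 3.0 } };

  probe (v, v);

  /* Sanity check only: any valid object is at least naturally aligned.  */
  assert (off_x % _Alignof (double) == 0);
  assert (off_y % _Alignof (double) == 0);

  printf ("&x %% 16 = %u, &y %% 16 = %u\n", off_x, off_y);
}
```

Calling `report_arg_alignment ()` from `main` and comparing the two values shows whether the second argument's slot is 16-byte aligned on that target. If the offsets differ, the alignment of an argument depends on what precedes it, which is the reason a cost-model heuristic cannot assume aligned vector loads from such parameters.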