https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908
--- Comment #22 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Hongtao.liu from comment #21)
> Now we have SLP node available in vector cost hook, maybe we can do sth in
> cost model to prevent vectorization when node's definition from big-size
> parameter.

Note we vectorize a load here for which we do not pass down an SLP node.
But of course there's the stmt-info one could look at - but the issue is
that for SLP that doesn't tell you which part of the variable is accessed.
Also even if we were to pass down the SLP node we do not know exactly how
it is going to vectorize - but sure, we could play with some heuristics there.

For x86 we can just assume that all aggregates > 16 bytes are passed on
the stack, correct?

Note I see for

#include <stdlib.h>

struct X { double x[3]; };
typedef double v2df __attribute__((vector_size(16)));

v2df __attribute__((noipa))
foo (struct X x, struct X y)
{
  return (v2df) {x.x[1], x.x[2] } + (v2df) { y.x[0], y.x[1] };
}

struct X y;
int main(int argc, char **argv)
{
  struct X x = y;
  int cnt = atoi (argv[1]);
  for (int i = 0; i < cnt; ++i)
    foo (x, x);
  return 0;
}

the structs passed as

        movups  %xmm0, 24(%rsp)
        movq    %rax, 40(%rsp)
        movq    %rax, 16(%rsp)
        movups  %xmm0, (%rsp)
        call    foo

so alignment of the stack variable depends on the position of the function
argument (and thus preceding parameters). That means we cannot rely
on &y being 16 byte aligned and it seems we cannot rely on a particular
store sequence order either here.

That would mean pessimization of all incoming stack parameters > 16 bytes
in size (maybe also == 16 bytes?) because we do not know how the caller
pushed the parameters? (without the caller using %xmm stores all such
vectorization would trigger STLF failures - dependent on the load-to-store
"distance" of course).

Can you ask engineers at Intel what a big enough "distance" would be
to make sure the store has hit L1 (and is a load from L1 better than a failed
STLF, i.e. with the store still in the buffers but not forwardable)?
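
To make the alignment point concrete, here is a rough sketch of how the stack offsets of the incoming aggregates come about. It is not GCC code: stack_arg_offsets and access_aligned16 are hypothetical helpers, and the layout rule (each memory-class argument placed at the next multiple of max(8, its alignment), with sizes rounded up to 8 bytes and %rsp 16-byte aligned at the call) is my reading of the x86-64 SysV ABI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not GCC code: byte offset of each memory-class
   argument relative to the caller's %rsp at the call instruction.
   Assumes x86-64 SysV: each argument starts at the next offset that is a
   multiple of max(8, its alignment); sizes are rounded up to 8 bytes.  */
static void
stack_arg_offsets (const size_t *size, const size_t *align, size_t n,
                   size_t *offset)
{
  size_t pos = 0;
  for (size_t i = 0; i < n; ++i)
    {
      size_t a = align[i] < 8 ? 8 : align[i];
      assert ((a & (a - 1)) == 0);      /* alignment must be a power of two */
      pos = (pos + a - 1) & ~(a - 1);   /* round up to the slot alignment */
      offset[i] = pos;
      pos += (size[i] + 7) & ~(size_t) 7;
    }
}

/* Nonzero if a 16-byte access OFF bytes into argument I is 16-byte
   aligned, assuming %rsp is 16-byte aligned at the call.  */
static int
access_aligned16 (const size_t *offset, size_t i, size_t off)
{
  return ((offset[i] + off) % 16) == 0;
}
```

For the two struct X arguments above (size 24, alignment 8) this yields offsets 0 and 24, matching the stores at (%rsp) and 24(%rsp). The vector load of x.x[1], x.x[2] then sits at offset 8 and the one of y.x[0], y.x[1] at offset 24, so neither is 16-byte aligned, and inserting another argument in front would shift both.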