https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908

--- Comment #26 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Richard Biener from comment #22)
> (In reply to Hongtao.liu from comment #21)
> > Now that the SLP node is available in the vector cost hook, maybe we can do
> > something in the cost model to prevent vectorization when a node's
> > definition comes from a large parameter.
> 
> Note we vectorize a load here for which we do not pass down an SLP node.
> But of course there's the stmt-info one could look at - but the issue
> is that for SLP that doesn't tell you which part of the variable is accessed.
> Also even if we were to pass down the SLP node we do not know exactly how
> it is going to vectorize - but sure, we could play with some heuristics
> there.
> 
> For x86 we can just assume that all aggregates > 16 bytes are passed on the
> stack, correct?  Note I see for
> 
> #include <stdlib.h>
> 
> struct X { double x[3]; };
> typedef double v2df __attribute__((vector_size(16)));
> 
> v2df __attribute__((noipa))
> foo (struct X x, struct X y)
> {
>   return (v2df) {x.x[1], x.x[2] } + (v2df) { y.x[0], y.x[1] };
> }
> 
> struct X y;
> int main(int argc, char **argv)
> {
>   struct X x = y;
>   int cnt = atoi (argv[1]);
>   for (int i = 0; i < cnt; ++i)
>     foo (x, x);
>   return 0;
> }
> 
> the structs passed as
> 
>         movups  %xmm0, 24(%rsp)
>         movq    %rax, 40(%rsp)
>         movq    %rax, 16(%rsp)
>         movups  %xmm0, (%rsp)
>         call    foo
> 
> so alignment of the stack variable depends on the position of the
> function argument (and thus preceding parameters).  That means
> we cannot rely on &y being 16 byte aligned and it seems we cannot
> rely on a particular store sequence order either here.
We can start by disabling this vectorization under the very-cheap cost model
to fix the -O2 regressions, then fine-tune the heuristic in GCC 13.
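
To check the alignment point from comment #22 on a given target, a small
sketch along these lines might help.  It reuses struct X from the testcase;
param_offsets and struct offsets are hypothetical names made up for
illustration, and noinline stands in for noipa so that it also builds with
other compilers.  It only reports where the two by-value copies land within
a 16-byte window and assumes nothing about a particular ABI.

```c
#include <stdint.h>

/* Same 24-byte aggregate as in the testcase from comment #22.  */
struct X { double x[3]; };

/* Hypothetical helper type: offset of each incoming copy modulo 16.  */
struct offsets { unsigned first, second; };

/* Keep the call out of line so the arguments are really passed
   according to the calling convention.  */
__attribute__((noinline))
struct offsets
param_offsets (struct X x, struct X y)
{
  struct offsets o;
  /* Address of each parameter copy, reduced to its position inside
     a 16-byte window.  */
  o.first = (unsigned) ((uintptr_t) &x % 16);
  o.second = (unsigned) ((uintptr_t) &y % 16);
  return o;
}
```

Calling param_offsets (v, v) and printing both fields should show whether the
second copy shares the alignment of the first.  With the stores at (%rsp) and
24(%rsp) shown above, the two would be expected to differ by 8 on x86-64, but
that is a reading of the quoted assembly rather than something measured here.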
