[Bug target/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2

crazylht at gmail dot com via Gcc-bugs Sun, 27 Feb 2022 17:29:12 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908


--- Comment #26 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Richard Biener from comment #22)
> (In reply to Hongtao.liu from comment #21)
> > Now that we have the SLP node available in the vector cost hook, maybe we
> > can do something in the cost model to prevent vectorization when a node's
> > definition comes from a big-size parameter.
> 
> Note we vectorize a load here for which we do not pass down an SLP node.
> But of course there's the stmt-info one could look at - but the issue
> is that for SLP that doesn't tell you which part of the variable is accessed.
> Also even if we were to pass down the SLP node we do not know exactly how
> it is going to vectorize - but sure, we could play with some heuristics
> there.
> 
> For x86 we can just assume that all aggregates > 16 bytes are passed on the
> stack, correct?  Note I see for
> 
> #include <stdlib.h>
> 
> struct X { double x[3]; };
> typedef double v2df __attribute__((vector_size(16)));
> 
> v2df __attribute__((noipa))
> foo (struct X x, struct X y)
> {
>   return (v2df) {x.x[1], x.x[2] } + (v2df) { y.x[0], y.x[1] };
> }
> 
> struct X y;
> int main(int argc, char **argv)
> {
>   struct X x = y;
>   int cnt = atoi (argv[1]);
>   for (int i = 0; i < cnt; ++i)
>     foo (x, x);
>   return 0;
> }
> 
> the structs passed as
> 
>         movups  %xmm0, 24(%rsp)
>         movq    %rax, 40(%rsp)
>         movq    %rax, 16(%rsp)
>         movups  %xmm0, (%rsp)
>         call    foo
> 
> so alignment of the stack variable depends on the position of the
> function argument (and thus preceding parameters).  That means
> we cannot rely on &y being 16 byte aligned and it seems we cannot
> rely on a particular store sequence order either here.
We can start by disabling this vectorization under the very-cheap cost model
to fix the -O2 regressions, then fine-tune that in GCC 13.
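
To illustrate the alignment point quoted above, here is a rough sketch of
where by-value aggregates land in the outgoing argument area. It assumes the
x86-64 SysV convention in which aggregates larger than 16 bytes are passed in
memory, each one taking its size rounded up to 8 bytes, and that the area
starts 16-byte aligned at the call. The helper names (round_up,
stack_arg_offset, offset_is_16_aligned) are made up for this example and are
not GCC internals.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to a multiple of a; a must be a power of two. */
static size_t round_up(size_t n, size_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (n + a - 1) & ~(a - 1);
}

/* Hypothetical helper: byte offset, from the start of the outgoing
   argument area, of the index-th memory-passed aggregate.  Assumes
   every preceding argument is an aggregate of the same size whose
   alignment is at most 8 bytes, so each slot is the size rounded up
   to 8.  */
static size_t stack_arg_offset(size_t index, size_t size)
{
    return index * round_up(size, 8);
}

/* Nonzero if an offset from a 16-byte-aligned base is itself
   16-byte aligned.  */
static int offset_is_16_aligned(size_t off)
{
    return off % 16 == 0;
}
```

For struct X { double x[3]; } (24 bytes) this puts the first argument at
offset 0 and the second at offset 24, which agrees with the stores to
(%rsp) and 24(%rsp) in the quoted assembly. Under these assumptions only
the first slot is 16-byte aligned, so a cost-model heuristic cannot assume
aligned vector loads from such a parameter.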
