[Bug target/98167] [x86] Failure to optimize operation on identically shuffled operands into a shuffle of the result of the operation

crazylht at gmail dot com via Gcc-bugs Fri, 27 Aug 2021 00:02:48 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167


--- Comment #18 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Andrew Pinski from comment #17)
> (In reply to Hongtao.liu from comment #16)
> > typedef int v4si __attribute__ ((vector_size(16)));
> > 
> > v4si f(v4si a, v4si b) {
> >     v4si a1 = __builtin_shufflevector (a, a, 2, 3 ,1 ,0);
> >     v4si b1 = __builtin_shufflevector (b, a, 2, 3 ,1 ,0);
> >     return a1 * b1;
> > }
> > 
> > GCC generates
> > 
> > f:
> >         vpshufd xmm1, xmm1, 30
> >         vpshufd xmm0, xmm0, 30
> >         vpmulld xmm0, xmm0, xmm1
> >         ret
> > 
> > LLVM generates
> > 
> > f:                                      # @f
> >         vpmulld xmm0, xmm1, xmm0
> >         vpshufd xmm0, xmm0, 30                  # xmm0 = xmm0[2,3,1,0]
> >         ret
> 
> For the above, this is safe for -ftrapping-math as all elements are still
> used.  It is when some elements are not used that it might not be safe ...

For integer vectors it should be OK, right?
For floating-point vectors we can add the condition
flag_unsafe_math_optimizations || !flag_trapping_math to guard the
optimization.
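
For reference, here is a minimal sketch of the integer identity the
optimization relies on: multiplying two operands that were shuffled with the
same permutation gives the same lanes as shuffling the product. The helper
names (shuffle_2310, mul_of_shuffles, shuffle_of_mul, same_v4si) are
hypothetical and only for illustration; the permutation is written with plain
lane subscripts rather than __builtin_shufflevector so that it also builds
with older compilers. This is not the proposed GCC change itself.

```c
#include <assert.h>

typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper: select lanes [2,3,1,0], the same selection that
   vpshufd performs with immediate 30.  */
v4si
shuffle_2310 (v4si v)
{
  v4si r = { v[2], v[3], v[1], v[0] };
  return r;
}

/* Form in the test case: shuffle both operands, then multiply.  */
v4si
mul_of_shuffles (v4si a, v4si b)
{
  return shuffle_2310 (a) * shuffle_2310 (b);
}

/* Form after the transformation: multiply first, shuffle once.  */
v4si
shuffle_of_mul (v4si a, v4si b)
{
  return shuffle_2310 (a * b);
}

/* Return 1 if all four lanes match, 0 otherwise.  */
int
same_v4si (v4si x, v4si y)
{
  for (int i = 0; i < 4; i++)
    if (x[i] != y[i])
      return 0;
  return 1;
}

/* Assert that both forms agree for the given operands.  */
void
check_shuffle_commutes (v4si a, v4si b)
{
  assert (same_v4si (mul_of_shuffles (a, b), shuffle_of_mul (a, b)));
}
```

Calling check_shuffle_commutes with a = {1, 2, 3, 4} and b = {10, 20, 30, 40}
passes, and both forms yield {90, 160, 40, 10}. Every lane is consumed by this
permutation, so no lane is computed in one form that is not also computed in
the other; that is the property the floating-point guard above is meant to
protect when some lanes are dropped.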
