https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92651

--- Comment #11 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to rguent...@suse.de from comment #8)
> Sure.  Another option would be to enhance STV even further
> (or add some peephole patterns - combine runs before STV2) to
> transform the
> 
>         psubd   xmm3, xmm0
>         psubd   xmm0, xmm1
>         pmaxsd  xmm0, xmm3
> 
> into
> 
>         psubd  %xmm3, %xmm0
>         pabsd  %xmm0, %xmm0
> 
> for enhancing STV that means adding abs() patterns (or adding
> combine-like matching to the pass which I'd suggest not do).
>
> Clearly that the above conversion isn't done is a generic
> missed optimization.  Maybe you can benchmark that as well
> though I guess it won't come near the xor variant?

After PR97873 added abs support to STV, the following testcase:

#include <stdlib.h>  /* declares abs */

int test (int x, int y)
{
  return abs (x - y);
}

compiles with cc1 -O2 -msse4.1 -mstv -m32 to:

        movd    4(%esp), %xmm0
        movd    8(%esp), %xmm1
        psubd   %xmm1, %xmm0
        pabsd   %xmm0, %xmm0
        movd    %xmm0, %eax
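
For reference, the sub/sub/max sequence quoted from comment #8 and the
sub/abs sequence compute the same value. The following is only a minimal
sketch in plain C to illustrate that equivalence; the helper names
(abs_diff_max, abs_diff_abs, count_mismatches) are made up for this
example and do not appear in GCC. The range is kept small so that the
signed subtractions cannot overflow.

```c
#include <stdlib.h>

/* Hypothetical helper: the "sub, sub, max" form from the quoted comment. */
int abs_diff_max (int x, int y)
{
  int a = x - y;
  int b = y - x;
  return a > b ? a : b;
}

/* Hypothetical helper: the "sub, abs" form, as in the testcase above. */
int abs_diff_abs (int x, int y)
{
  return abs (x - y);
}

/* Count the inputs in [-limit, limit] x [-limit, limit] on which the two
   forms disagree.  Keep limit small to avoid signed overflow in x - y.  */
int count_mismatches (int limit)
{
  int mismatches = 0;
  for (int x = -limit; x <= limit; x++)
    for (int y = -limit; y <= limit; y++)
      if (abs_diff_max (x, y) != abs_diff_abs (x, y))
        mismatches++;
  return mismatches;
}
```

Calling count_mismatches with a small limit is a quick sanity check that
the two forms agree on that range; it says nothing about what the
compiler emits for either form.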
