https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116787

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #6)
> (In reply to Richard Biener from comment #0)
> > typedef float v4sf __attribute__((vector_size (sizeof (4 * sizeof
> > (float)))));
> > 
> > v4sf
> > foo (v4sf x, v4sf y)
> > {
> >   return x < y ? y : x;
> > }
> > 
> > is no longer generating
> > 
> > _Z3fooDv2_fS_:
> > .LFB0:
> >         .cfi_startproc
> >         maxps   %xmm0, %xmm1
> >         movaps  %xmm1, %xmm0
> >         ret
> > 
> > with -O2, neither with -O2 -msse4.2
> 
> It does with -ffast-math.

I think the issue is that we pad out the compare RTXes with zeros, but
not the RHS operands when they are equal to the compare operands.  As a
result we are later unable to combine to maxps.
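
For context on why the operands have to line up exactly: without
-ffast-math, maxps is not commutative for NaNs and signed zeros, so it
only matches x < y ? y : x when the compared values and the selected
values are the same two values in this order.  A minimal scalar sketch
of one lane, assuming the documented SSE rule that the source operand is
returned whenever the comparison is false.  maxps_lane, select_lane and
count_mismatches are illustrative names, not GCC functions or
intrinsics.

```c
#include <math.h>
#include <string.h>

/* One lane of "maxps src, dst" (AT&T operand order):
   dst = dst > src ? dst : src.  The source operand wins when either
   input is a NaN and when both compare equal (e.g. +0.0 vs -0.0).  */
static float maxps_lane (float src, float dst)
{
  return dst > src ? dst : src;
}

/* One lane of the test case from comment #0.  */
static float select_lane (float x, float y)
{
  return x < y ? y : x;
}

/* Bitwise comparison so that NaNs and signed zeros are distinguished.  */
static int same_bits (float a, float b)
{
  return memcmp (&a, &b, sizeof a) == 0;
}

/* Count inputs where maxps disagrees with the source expression.
   swapped == 0 models "maxps %xmm0, %xmm1" with x in %xmm0, y in %xmm1;
   swapped != 0 models the same instruction with the operands exchanged.  */
int count_mismatches (int swapped)
{
  const float v[] = { 1.0f, 2.0f, 0.0f, -0.0f, NAN, INFINITY };
  const int n = sizeof v / sizeof v[0];
  int mismatches = 0;

  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      {
        float x = v[i], y = v[j];
        float want = select_lane (x, y);
        float got = swapped ? maxps_lane (y, x) : maxps_lane (x, y);
        if (!same_bits (want, got))
          mismatches++;
      }
  return mismatches;
}
```

With the operands in the order shown in the expected assembly the two
functions agree on every input pair; exchanging them changes the result
for NaN and +0.0/-0.0 inputs, which is consistent with the pattern only
being usable under -ffast-math once the operands no longer match.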

Something like

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 9a8d6030d8b..8b846306fd6 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1184,6 +1184,14 @@

   emit_insn (gen_movq_v2sf_to_sse (ops[5], operands[5]));
   emit_insn (gen_movq_v2sf_to_sse (ops[4], operands[4]));
+  if (rtx_equal_p (operands[5], operands[2]))
+    ops[2] = ops[5];
+  else if (rtx_equal_p (operands[5], operands[1]))
+    ops[1] = ops[5];
+  if (rtx_equal_p (operands[4], operands[1]))
+    ops[1] = ops[4];
+  else if (rtx_equal_p (operands[4], operands[2]))
+    ops[2] = ops[4];

   bool ok = ix86_expand_fp_vcond (ops);
   gcc_assert (ok);

generates the expected

_Z3fooDv2_fS_:
.LFB0:
        .cfi_startproc
        movq    %xmm0, %xmm0
        movq    %xmm1, %xmm1
        maxps   %xmm0, %xmm1
        movaps  %xmm1, %xmm0
        ret
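
The effect of the operand reuse can be modelled outside GCC.  The
following is a toy sketch only: it assumes that, before the patch, the
select values end up in registers distinct from the zero-padded compare
operands, and that a min/max form is only recognised when the select
values are the very same registers as the compare operands.  The reg
type and the functions fresh_copy, expand, minmax_shape and demo are
hypothetical stand-ins, not GCC APIs.

```c
#include <stdbool.h>

/* Toy stand-in for a pseudo register number; not a GCC type.  */
typedef int reg;

static reg next_reg = 100;

/* Model of copying a V2SF value into a fresh, wider pseudo.  */
static reg fresh_copy (reg v2sf)
{
  (void) v2sf;
  return next_reg++;
}

/* Index 1, 2: select values; index 4, 5: compare operands,
   mirroring the operand numbering used in the patch.  */
static void expand (const reg operands[6], reg ops[6], bool reuse)
{
  ops[1] = fresh_copy (operands[1]);
  ops[2] = fresh_copy (operands[2]);
  ops[5] = fresh_copy (operands[5]);
  ops[4] = fresh_copy (operands[4]);

  if (reuse)
    {
      /* Same structure as the proposed hunk, with rtx_equal_p
         modelled as integer equality.  */
      if (operands[5] == operands[2])
        ops[2] = ops[5];
      else if (operands[5] == operands[1])
        ops[1] = ops[5];
      if (operands[4] == operands[1])
        ops[1] = ops[4];
      else if (operands[4] == operands[2])
        ops[2] = ops[4];
    }
}

/* Assumed condition for a min/max form: the select values are
   exactly the compare operands, in either order.  */
static bool minmax_shape (const reg ops[6])
{
  return (ops[1] == ops[4] && ops[2] == ops[5])
         || (ops[1] == ops[5] && ops[2] == ops[4]);
}

/* x < y ? y : x with x = reg 1 and y = reg 2.  */
bool demo (bool reuse)
{
  const reg operands[6] = { 0, 2, 1, 0, 1, 2 };
  reg ops[6] = { 0 };

  expand (operands, ops, reuse);
  return minmax_shape (ops);
}
```

In this model demo (false) does not see a min/max shape because all
four registers differ, while demo (true) does, which is the behaviour
the patch is aiming for.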
