https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116787
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #6)
> (In reply to Richard Biener from comment #0)
> > typedef float v4sf __attribute__((vector_size (sizeof (4 * sizeof
> > (float)))));
> >
> > v4sf
> > foo (v4sf x, v4sf y)
> > {
> >   return x < y ? y : x;
> > }
> >
> > is no longer generating
> >
> > _Z3fooDv2_fS_:
> > .LFB0:
> >         .cfi_startproc
> >         maxps   %xmm0, %xmm1
> >         movaps  %xmm1, %xmm0
> >         ret
> >
> > with -O2, neither with -O2 -msse4.2
>
> It does with -ffast-math.

I think the issue is that we pad out the compare RTXen with zeros but not
the RHS when they are equal to the compare operands.  We then later are
not able to combine to maxps.  Something like

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 9a8d6030d8b..8b846306fd6 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1184,6 +1184,14 @@
   emit_insn (gen_movq_v2sf_to_sse (ops[5], operands[5]));
   emit_insn (gen_movq_v2sf_to_sse (ops[4], operands[4]));
+  if (rtx_equal_p (operands[5], operands[2]))
+    ops[2] = ops[5];
+  else if (rtx_equal_p (operands[5], operands[1]))
+    ops[1] = ops[5];
+  if (rtx_equal_p (operands[4], operands[1]))
+    ops[1] = ops[4];
+  else if (rtx_equal_p (operands[4], operands[2]))
+    ops[2] = ops[4];
   bool ok = ix86_expand_fp_vcond (ops);
   gcc_assert (ok);

generates the expected

_Z3fooDv2_fS_:
.LFB0:
        .cfi_startproc
        movq    %xmm0, %xmm0
        movq    %xmm1, %xmm1
        maxps   %xmm0, %xmm1
        movaps  %xmm1, %xmm0
        ret