https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115683
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
For gcc.target/i386/pr88540.c we expand the mask producer as

(insn 12 11 13 (set (reg:V2DF 109) (lt:V2DF (reg:V2DF 101 [ vect__23.6 ]) (reg:V2DF 98 [ vect__25.9 ]))) -1 (nil))
(insn 13 12 14 (set (reg:V2DI 108) (subreg:V2DI (reg:V2DF 109) 0)) -1 (nil))
(insn 14 13 15 (set (reg:V2DI 107 [ mask__26.10_21 ]) (reg:V2DI 108)) -1 (nil))

I think that going through a named expander for the vec_cmp means we cannot use TER tricks like we do with the scalar expansion, which produces the min from the x86 expander directly.

combine sees

   12: r109:V2DF=r105:V2DF<r106:V2DF
   15: r110:V2DF=r105:V2DF&r109:V2DF
      REG_DEAD r105:V2DF
   16: r111:V2DF=~r109:V2DF&r106:V2DF
      REG_DEAD r109:V2DF
      REG_DEAD r106:V2DF
   17: r100:V2DF=r111:V2DF|r110:V2DF

It tries 12, 15 -> 17 and 16, 15 -> 17, but I think the four-insn combinations do not include this "diamond" variant. It has I0, I1 -> I2, I2 -> I3, but this would be I0 -> I1, I0 -> I2, (I1, I2) -> I3. I'm not sure it would do that at all even if we passed the insns to try_combine.

I don't see a good way for combine helpers; the only option would have been to keep the blend (15, 16, 17) in a single insn to be split only after combine. With SSE 4.1 this is what happens (UNSPEC_BLENDV).

The other option, of course, is catching this min/max form with a new optab during ISEL, or having it emitted (and costed) by the vectorizer directly. It would be quite special: select_lt and select_gt maybe, eventually merged into a single select with a compare op, like we have for vec_cmp, to specify the comparison code. select (A code B) would then be A code B ? A : B.
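
To make the shape of the problem concrete, here is a minimal, portable C sketch of the four-insn sequence combine sees (12, 15, 16, 17) next to the proposed "A < B ? A : B" form. It is only an illustration, not GCC code: the v2df struct and the names blend_lt, select_lt, same_bits and check_equivalence are hypothetical, and the lanes are emulated with plain integer bit operations rather than real vector instructions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical two-lane type standing in for a V2DF register. */
typedef struct { double lane[2]; } v2df;

/* Reinterpret a double as its 64-bit pattern and back. */
static uint64_t bits_of(double d) { uint64_t u; memcpy(&u, &d, sizeof u); return u; }
static double double_of(uint64_t u) { double d; memcpy(&d, &u, sizeof d); return d; }

/* The mask/and/andnot/ior sequence, one lane at a time.
   The mask feeds both 15 and 16, which both feed 17: the "diamond". */
static v2df blend_lt(v2df a, v2df b)
{
    v2df r;
    for (int i = 0; i < 2; i++) {
        uint64_t mask = a.lane[i] < b.lane[i] ? ~UINT64_C(0) : 0; /* insn 12 */
        uint64_t t = bits_of(a.lane[i]) & mask;                   /* insn 15 */
        uint64_t f = ~mask & bits_of(b.lane[i]);                  /* insn 16 */
        r.lane[i] = double_of(f | t);                             /* insn 17 */
    }
    return r;
}

/* The proposed select form with code = lt: A < B ? A : B per lane. */
static v2df select_lt(v2df a, v2df b)
{
    v2df r;
    for (int i = 0; i < 2; i++)
        r.lane[i] = a.lane[i] < b.lane[i] ? a.lane[i] : b.lane[i];
    return r;
}

/* Bitwise comparison, so NaN and signed-zero results are compared exactly. */
static int same_bits(v2df x, v2df y) { return memcmp(&x, &y, sizeof x) == 0; }

/* Aborts if the blend and the select disagree on any lane. */
static void check_equivalence(v2df a, v2df b)
{
    assert(same_bits(blend_lt(a, b), select_lt(a, b)));
}
```

Calling check_equivalence on a few inputs, including NaN and signed zeros, shows the two forms agreeing bit for bit in this emulation, which is the equivalence a single select-style pattern would need to express.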