Hi,
For the following test-case, taken from pr88152.C:

#include <x86intrin.h>

template <typename T, size_t N>
using V [[gnu::vector_size(N)]] = T;

int f10 (V<unsigned long long, 16> a)
{
  return _mm_movemask_pd (reinterpret_cast<__m128d> (a > __LONG_LONG_MAX__));
}

the .optimized dump shows:

f10 (V a)
{
  vector(2) signed long _1;
  vector(2) long int _2;
  vector(2) double _3;
  int _6;

  <bb 2> [local count: 1073741824]:
  _1 = VIEW_CONVERT_EXPR<vector(2) signed long>(a_4(D));
  _2 = VEC_COND_EXPR <_1 < { 0, 0 }, { -1, -1 }, { 0, 0 }>;
  _3 = VIEW_CONVERT_EXPR<__m128d>(_2);
  _6 = __builtin_ia32_movmskpd (_3); [tail call]
  return _6;
}

IIUC, we're using -1 to represent true and 0 to represent false.
combine then does the following combinations:

Trying 7 -> 9:
    7: r90:V2DI=r89:V2DI>r93:V2DI
      REG_DEAD r93:V2DI
      REG_DEAD r89:V2DI
    9: r91:V2DF=r90:V2DI#0
      REG_DEAD r90:V2DI
Successfully matched this instruction:
(set (subreg:V2DI (reg:V2DF 91) 0)
    (gt:V2DI (reg:V2DI 89)
        (reg:V2DI 93)))
allowing combination of insns 7 and 9

Trying 6, 9 -> 10:
    6: r89:V2DI=const_vector
    9: r91:V2DF#0=r89:V2DI>r93:V2DI
      REG_DEAD r89:V2DI
      REG_DEAD r93:V2DI
   10: r87:SI=unspec[r91:V2DF] 43
      REG_DEAD r91:V2DF
Successfully matched this instruction:
(set (reg:SI 87)
    (unspec:SI [
            (lt:V2DF (reg:V2DI 93)
                (const_vector:V2DI [
                        (const_int 0 [0]) repeated x2
                    ]))
        ] UNSPEC_MOVMSK))

Is the above folding correct, given that lt has V2DF mode and casting -1
(literally) to DFmode would result in -NaN?
Also, should the result of lt only have integral modes?

split2 then folds insn 10 into:

(insn 22 9 16 2 (set (reg:SI 0 ax [87])
        (unspec:SI [
                (reg:V2DF 20 xmm0 [93])
            ] UNSPEC_MOVMSK))
"../../stage1-build/gcc/include/emmintrin.h":958:34 4222 {sse2_movmskpd}
     (nil))

deleting insn 10.
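
As a sanity check on the -NaN question, here is a minimal scalar sketch. It is
not GCC code: the helper names bits_as_double, sign_bit_of_lane, lt_zero_mask,
movmask2 and run_checks are made up for illustration, and movmask2 only
emulates what movmskpd does, namely collecting the sign bit of each 64-bit
lane. It assumes IEEE-754 binary64 doubles. It suggests that the all-ones mask
reinterpreted as a double is indeed a NaN with the sign bit set, but because
movmskpd only reads sign bits, the mask for x < 0 and x itself give the same
result, which is presumably why the comparison can be dropped:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: reinterpret 64 integer bits as a double, the scalar
// analogue of VIEW_CONVERT_EXPR or a V2DI -> V2DF subreg (no value conversion).
static double bits_as_double(std::int64_t bits) {
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// Hypothetical emulation of one movmskpd lane: extract the sign bit (bit 63).
static int sign_bit_of_lane(std::int64_t lane) {
    return static_cast<int>(static_cast<std::uint64_t>(lane) >> 63);
}

// Per-lane comparison mask as in the VEC_COND_EXPR: -1 for true, 0 for false.
static std::int64_t lt_zero_mask(std::int64_t x) {
    return x < 0 ? -1 : 0;
}

// Hypothetical two-lane movmskpd emulation: lane 0 -> bit 0, lane 1 -> bit 1.
static int movmask2(std::int64_t lane0, std::int64_t lane1) {
    return sign_bit_of_lane(lane0) | (sign_bit_of_lane(lane1) << 1);
}

// Checks the three observations on a handful of sample values.
static void run_checks() {
    // All-ones bits viewed as a double: a NaN with the sign bit set.
    assert(std::isnan(bits_as_double(-1)));
    assert(std::signbit(bits_as_double(-1)));

    const std::int64_t max = std::numeric_limits<std::int64_t>::max();
    const std::int64_t min = std::numeric_limits<std::int64_t>::min();
    const std::int64_t samples[] = {0, 1, -1, 42, -42, max, min};

    for (std::int64_t a : samples) {
        // Unsigned "a > LLONG_MAX" is the same test as signed "a < 0".
        bool unsigned_gt = static_cast<std::uint64_t>(a) >
                           static_cast<std::uint64_t>(max);
        assert(unsigned_gt == (a < 0));
        for (std::int64_t b : samples) {
            // Sign bits of the masks equal the sign bits of the inputs.
            assert(movmask2(lt_zero_mask(a), lt_zero_mask(b)) == movmask2(a, b));
        }
    }
}
```

Calling run_checks() from a small main () should pass silently when asserts
are enabled. This only models the sign-bit behaviour and says nothing about
whether an lt rtx with V2DF mode is valid RTL in the first place.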

The issue is that my patch for PR88833 results in the following propagation
in forwprop1:

In insn 10, replacing
(unspec:SI [
        (reg:V2DF 91)
    ] UNSPEC_MOVMSK)
with
(unspec:SI [
        (subreg:V2DF (reg:V2DI 90) 0)
    ] UNSPEC_MOVMSK)

deleting insn 9

and this inhibits the above combinations, resulting in the failure of
pr88152.C.

With the patch, combine shows:

Trying 7 -> 10:
    7: r90:V2DI=r89:V2DI>r93:V2DI
      REG_DEAD r93:V2DI
      REG_DEAD r89:V2DI
   10: r87:SI=unspec[r90:V2DI#0] 43
      REG_DEAD r90:V2DI
Failed to match this instruction:
(set (reg:SI 87)
    (unspec:SI [
            (subreg:V2DF (gt:V2DI (reg:V2DI 89)
                    (reg:V2DI 93)) 0)
        ] UNSPEC_MOVMSK))

and it subsequently fails to match 6, 7 -> 10.

Patch: http://people.linaro.org/~prathamesh.kulkarni/pr88833-10.diff

Upstream discussion about the issue:
https://gcc.gnu.org/ml/gcc-patches/2019-06/msg01651.html

Thanks,
Prathamesh