https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117875
--- Comment #19 from Richard Biener <rguenth at gcc dot gnu.org> ---
Remains

 fast_algorithms.c:133:19: optimized: Loop 3 distributed: split to 3 loops and 0 library calls.
 fast_algorithms.c:133:19: optimized: Loop 5 distributed: split to 2 loops and 0 library calls.
 fast_algorithms.c:133:19: optimized: loop vectorized using 32 byte vectors
 fast_algorithms.c:133:19: optimized: loop versioned for vectorization because of possible aliasing
 fast_algorithms.c:133:19: optimized: loop vectorized using 32 byte vectors
 fast_algorithms.c:133:19: optimized: loop versioned for vectorization because of possible aliasing
-fast_algorithms.c:133:19: optimized: loop vectorized using 32 byte vectors
-fast_algorithms.c:133:19: optimized: loop versioned for vectorization because of possible aliasing

specifically

fast_algorithms.c:133:19: note: using as main loop exit: 58 -> 63 [AUX: (nil)]
fast_algorithms.c:133:19: note: LOOP VECTORIZED

is no longer vectorized.  We have

fast_algorithms.c:133:19: note: Build SLP for _ifc__350 = _1750;
fast_algorithms.c:133:19: missed: Build SLP failed: operation unsupported _ifc__350 = _1750;

which came up before.  This is emitted from if-conversion:

   # DEBUG BEGIN_STMT
-  if (_1359 < -987654321)
-    goto <bb 62>; [50.00%]
-  else
-    goto <bb 61>; [50.00%]
-
-  <bb 59> [local count: 489894279]:
-  goto <bb 58>; [100.00%]
-
-  <bb 60> [local count: 550443010]:
+  _227 = _1359 < -987654321;
+  # DEBUG BEGIN_STMT
+  _ifc__347 = _227 ? -987654321 : _1355;
+  _1750 = MAX_EXPR <_1359, -987654321>;
+  _ifc__350 = _1750;
+  *_1347 = _ifc__350;
   # DEBUG BEGIN_STMT
   k_1361 = k_1291 + 1;
   # DEBUG k => k_1361
@@ -2619,14 +2683,77 @@
   else
     goto <bb 63>; [11.00%]

-  <bb 61> [local count: 374301246]:
-  *_1347 = _1359;
-  goto <bb 60>; [100.00%]
+  <bb 59> [local count: 489894279]:

specifically from VN run on the body:

Value numbering stmt = _ifc__350 = _1287 ? _ifc__348 : _ifc__349;
Applying pattern match.pd:6569, gimple-match-3.cc:47593
Setting value number of _ifc__350 to _ifc__350 (changed)
Applying pattern match.pd:5271, gimple-match-4.cc:7879
Applying pattern match.pd:6365, gimple-match-5.cc:6177
Applying pattern match.pd:6569, gimple-match-3.cc:47593
gimple_simplified to _1750 = MAX_EXPR <_1359, -987654321>;
_ifc__350 = _1750;
Making available beyond BB58 _ifc__350 for value _ifc__350

where

 (cond (cmp:c (nop_convert1?@c0 @0) (nop_convert2?@c1 @1))
       (convert3? @0)
       (convert4? @1))

is simplified as

 (convert (max @c0 @c1)))))

and the conversion turns out unnecessary.  There's a longer standing issue
with match producing this, we generate

          {
            res_op->set_op (NOP_EXPR, type, 1);
            {
              tree _o1[2], _r1;
              _o1[0] = captures[0];
              _o1[1] = captures[2];
              gimple_match_op tem_op (res_op->cond.any_else (), MAX_EXPR, TREE_TYPE (_o1[0]), _o1[0], _o1[1]);
              tem_op.resimplify (lseq, valueize);
              _r1 = maybe_push_res_to_seq (&tem_op, lseq);
              if (!_r1) goto next_after_fail1387;
              res_op->ops[0] = _r1;
            }
            res_op->resimplify (lseq, valueize);

so we push the inner expression without considering the outer conversion.

The other thing is that VN elimination folds stmts we substitute into
but even in non-iterating mode we do not value-number all resulting stmts,
we merely assign value-numbers to resulting defs.  So those copies prevail.

I've thought we should fix it elsewhere than allowing SSA copies in SLP
build, but this shows it will be the easiest fix given non-SLP handles
this as vector operation just fine.  So I'll go for this, filing separate
issues for the two above.
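
For readers following along, here is a hypothetical source-level sketch of the loop shape discussed above. It is not taken from fast_algorithms.c; the function names, the `restrict` qualifiers and the macro are assumptions made for illustration, and only the constant is taken from the dumps. The first function stores through a branch, which is what if-conversion has to flatten; the second writes the select form v < C ? C : v, which is the shape that gets simplified to a MAX. Whether either loop is vectorized depends on the GCC version and options; something like -O3 -fopt-info-vec-missed should report the decision.

```c
#include <stddef.h>

/* Hypothetical lower bound; the value matches the constant in the dumps. */
#define LOWER_BOUND (-987654321)

/* Branchy form: both arms store to dst[k], so the loop body has control
   flow that if-conversion must turn into straight-line code.  */
void clamp_branchy(int *restrict dst, const int *restrict src, size_t n)
{
    for (size_t k = 0; k < n; ++k) {
        int v = src[k];
        if (v < LOWER_BOUND)
            dst[k] = LOWER_BOUND;
        else
            dst[k] = v;
    }
}

/* Select form: v < C ? C : v is equal to MAX (v, C) for integers, so the
   result matches clamp_branchy element by element.  */
void clamp_select(int *restrict dst, const int *restrict src, size_t n)
{
    for (size_t k = 0; k < n; ++k) {
        int v = src[k];
        dst[k] = v < LOWER_BOUND ? LOWER_BOUND : v;
    }
}
```

Both functions compute the same result; the sketch only illustrates the source pattern and says nothing about where the SSA copy from the simplified MAX ends up in the actual testcase.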