https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108360
--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
I now see the following IL going into the last VRP pass:

  <bb 2> [local count: 1073741824]:
  b.2_1 = b;
  _2 = b.2_1 <= 0;
  h.0_19 = (unsigned short) _2;
  _20 = h.0_19 + 65535;
  _21 = (short int) _20;
  _3 = _21 >= 0;
  _4 = (char) _3;
  f = _4;
  f.5_5 = (unsigned char) _3;
  _6 = f.5_5 << 4;
  e = _6;
  _23 = (short int) _3;
  _26 = _23 << 4;
  _8 = b.2_1 == _26;        [0,1]
  h_15 = (short int) _8;    [0,1]
  _22 = h_15 + -1;          [-1,0]
  _18 = (unsigned int) _22;
  if (_18 <= 3)   // we could substitute (b.2_1 == _26) here
    goto <bb 3>; [33.00%]
  else
    goto <bb 4>; [67.00%]

  <bb 3> [local count: 354334800]:
  foo ();

The last branch could be simplified by VRP. First, we fail to replace it
by _18 == 0, which would then fold to _22 == 0 and then further to the
suggested compare.

Note I see -O2 and -O1 optimizing this just fine.  DOM2 threads a jump
here, but some differences start earlier already.  It really looks like
some phase-ordering bad luck triggers here, mainly caused by early
inlining differences.
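
The range argument behind the suggested fold can be checked by hand with a small C sketch. This is only an illustration, not GCC code: the function names cond_original, cond_simplified and ranges_agree are hypothetical, and the sketch assumes the ranges quoted in the dump (h_15 in [0,1], so _22 in [-1,0]) and a target where short is narrower than unsigned int.

```c
#include <assert.h>

/* Branch condition as written in the IL:
   _18 = (unsigned int) _22;  if (_18 <= 3)  */
int cond_original(short v22)
{
    return (unsigned int) v22 <= 3;
}

/* Condition after the first suggested step: _22 == 0.
   Only valid when _22 is known to lie in [-1, 0].  */
int cond_simplified(short v22)
{
    return v22 == 0;
}

/* Walk every value h_15 can take ([0,1]) and confirm that both forms of
   the condition agree, and that they reduce to "h_15 == 1", i.e. to the
   boolean _8 = (b.2_1 == _26).  Returns 1 on agreement, 0 otherwise.  */
int ranges_agree(void)
{
    for (short h15 = 0; h15 <= 1; h15++) {
        short v22 = (short) (h15 - 1);   /* _22 = h_15 + -1 */
        if (cond_original(v22) != cond_simplified(v22))
            return 0;
        if (cond_simplified(v22) != (h15 == 1))
            return 0;
    }
    return 1;
}
```

Outside the [-1,0] range the two forms differ (for example, 2 satisfies the unsigned compare but not the equality), which is why the fold depends on the range information VRP has for _22.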