https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79534
--- Comment #12 from James Greenhalgh <jgreenhalgh at gcc dot gnu.org> --- So while there's nothing buggy about the if-conversion which causes the performance issue, it does show an interesting missed optimization that ifcvt can't handle. We make the transform through find_if_case_2, which looks for things of the form: /* TEST BLOCK */ if (test) goto E; // x not live /* FALLTHRU */ /* ELSE BLOCK */ x = big(); goto L; E: /* THEN BLOCK */ x = b; goto M; And transforms them to: /* Unconditional copy of THEN BLOCK */ x = b; /* TEST BLOCK */ if (test) goto M; /* ELSE BLOCK */ x = big(); goto L; In the testcase, using the naming conventions above, and snipping irrelevant details, this looks like: TEST BLOCK (309) ;; basic block 309, loop depth 4, count 0, freq 3153, maybe hot ;; prev block 308, next block 311, flags: (REACHABLE, RTL, MODIFIED) ;; pred: 219 [98.0%] <snip more insns> (insn 1915 1914 1916 309 (set (reg:CC 66 cc) (compare:CC (reg:SI 1117) (const_int 0 [0]))) "foo.cpp":59 391 {cmpsi} (expr_list:REG_DEAD (reg:SI 1117) (nil))) (jump_insn 1916 1915 1922 309 (set (pc) (if_then_else (ne (reg:CC 66 cc) (const_int 0 [0])) (label_ref 1905) (pc))) "foo.cpp":59 9 {condjump} (expr_list:REG_DEAD (reg:CC 66 cc) (int_list:REG_BR_PROB 9558 (nil))) -> 1905) ;; succ: 227 [95.6%] ;; 226 [4.4%] (FALLTHRU) ELSE BLOCK (226) ;; basic block 226, loop depth 4, count 0, freq 201, maybe hot ;; prev block 224, next block 227, flags: (REACHABLE, RTL, MODIFIED) ;; pred: 309 [4.4%] (FALLTHRU) ;; 311 [3.8%] (FALLTHRU) (code_label 1917 1413 1417 226 141 (nil) [0 uses]) (note 1417 1917 1418 226 [bb 226] NOTE_INSN_BASIC_BLOCK) (insn 1418 1417 1905 226 (set (reg:SI 690 [ _1517 ]) (plus:SI (reg:SI 703 [ ivtmp.56D.5375 ]) (const_int -3 [0xfffffffffffffffd]))) 95 {*addsi3_aarch64} (nil)) ;; succ: 237 [100.0%] (FALLTHRU) THEN BLOCK (227) ;; basic block 227, loop depth 4, count 0, freq 3013, maybe hot ;; prev block 226, next block 228, flags: (REACHABLE, RTL, MODIFIED) ;; pred: 309 [95.6%] (code_label 1905 1418 1421 227 140 (nil) [1 uses]) (note 1421 1905 1422 227 [bb 227] NOTE_INSN_BASIC_BLOCK) (insn 1422 1421 1802 227 (set (reg:SI 690 [ _1517 ]) (plus:SI (reg:SI 703 [ ivtmp.56D.5375 ]) (const_int -3 [0xfffffffffffffffd]))) 95 {*addsi3_aarch64} (nil)) ;; succ: 237 [100.0%] (FALLTHRU) So the interesting thing is that the THEN block and the ELSE block are as good as identical! Both compute (plus (reg 703) (const_int -3)) and both fall through to block 237. The normal if-convert machinery won't catch this because basic block 226 (the ELSE block) has multiple predecessors. But the transformation we make through find_if_case_2 ends up looking silly! (again, snipping some unrelated details/insns): TEST BLOCK (279) ;; basic block 279, loop depth 4, count 0, freq 3153, maybe hot ;; prev block 278, next block 280, flags: (REACHABLE, RTL, MODIFIED) ;; pred: 203 [98.0%] <snip some insns> /* Unconditional copy of THEN BLOCK. */ (insn 1422 1914 1915 279 (set (reg:SI 690 [ _1517 ]) (plus:SI (reg:SI 703 [ ivtmp.56D.5375 ]) (const_int -3 [0xfffffffffffffffd]))) 95 {*addsi3_aarch64} (nil)) (insn 1915 1422 1916 279 (set (reg:CC 66 cc) (compare:CC (reg:SI 1117) (const_int 0 [0]))) "foo.cpp":59 391 {cmpsi} (expr_list:REG_DEAD (reg:SI 1117) (nil))) (jump_insn 1916 1915 1922 279 (set (pc) (if_then_else (ne (reg:CC 66 cc) (const_int 0 [0])) (label_ref:DI 1470) (pc))) "foo.cpp":59 9 {condjump} (expr_list:REG_DEAD (reg:CC 66 cc) (int_list:REG_BR_PROB 9558 (nil))) -> 1470) ;; succ: 218 [95.6%] ;; 209 [4.4%] (FALLTHRU) ELSE BLOCK (209): ;; basic block 209, loop depth 4, count 0, freq 201, maybe hot ;; prev block 208, next block 210, flags: (REACHABLE, RTL, MODIFIED) ;; pred: 279 [4.4%] (FALLTHRU) ;; 280 [3.8%] (FALLTHRU) (code_label 1917 1413 1417 209 141 (nil) [0 uses]) (note 1417 1917 1418 209 [bb 209] NOTE_INSN_BASIC_BLOCK) (insn 1418 1417 1802 209 (set (reg:SI 690 [ _1517 ]) (plus:SI (reg:SI 703 [ ivtmp.56D.5375 ]) (const_int -3 [0xfffffffffffffffd]))) 95 {*addsi3_aarch64} (nil)) ;; succ: 218 [100.0%] (FALLTHRU) Note that if we are on the "else" path, we now we compute (plus (reg 703) (const_int -3)) twice! The missed optimisation is that this should be an unconditional branch! There is probably a route to getting your performance back by improving ifcvt to handle this case.