https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85475
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> --- In dom3 I see Applying pattern match.pd:2585, gimple-match.c:10461 Applying pattern match.pd:2585, gimple-match.c:10461 Applying pattern match.pd:2585, gimple-match.c:10461 Applying pattern match.pd:2585, gimple-match.c:10461 Applying pattern match.pd:2585, gimple-match.c:10461 Applying pattern match.pd:2585, gimple-match.c:10461 Applying pattern match.pd:2585, gimple-match.c:10461 Applying pattern match.pd:2585, gimple-match.c:10461 Applying pattern match.pd:2585, gimple-match.c:10461 Applying pattern match.pd:2585, gimple-match.c:10461 ... it is for example matching on (le_10 * 2) * (le_10 * 2), transforming it to (le_10 * le_10) * 4 which fails because of the :s. But then DOM faces the situation where the loop is fully unrolled and we have nj (int le) { int zb; unsigned int ivtmp_1; int _6; unsigned int ivtmp_16; unsigned int ivtmp_100; <bb 2> [local count: 63136020]: le_14 = le_3(D) * 2; zb_15 = 1; ivtmp_16 = 15; le_20 = le_14 * 2; zb_21 = 2; le_26 = le_20 * 2; zb_27 = 3; le_32 = le_26 * 2; zb_33 = 4; le_38 = le_32 * 2; zb_39 = 5; le_44 = le_38 * 2; zb_45 = 6; le_50 = le_44 * 2; zb_51 = 7; le_56 = le_50 * 2; zb_57 = 8; le_62 = le_56 * 2; zb_63 = 9; le_68 = le_62 * 2; zb_69 = 10; le_74 = le_68 * 2; zb_75 = 11; le_80 = le_74 * 2; zb_81 = 12; le_86 = le_80 * 2; zb_87 = 13; le_92 = le_86 * 2; zb_93 = 14; le_98 = le_92 * 2; zb_99 = 15; ivtmp_100 = 1; le_4 = le_98 * 2; zb_5 = 16; ivtmp_1 = 0; _6 = le_4 * le_4; return _6; } with DOM folding le_4 * le_4 via gimple_fold_stmt_to_constant_1. You can trivially see the exponential work we do when applying the pattern if you consider that we re-simplify each simplification result. /* Reassociate (X * CST) * Y to (X * Y) * CST. This does not introduce signed overflow for CST != 0 && CST != -1. */ (simplify (mult:c (mult:s @0 INTEGER_CST@1) @2) (if (TREE_CODE (@2) != INTEGER_CST && !integer_zerop (@1) && !integer_minus_onep (@1)) (mult (mult @0 @2) @1))) The issue is that we get into non-linear complexity once there's more than a single use of the inner multiplication, so an explicit single_use fixes that.