https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118634
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2025-01-24
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
It's indeed missed unrolling / jump threading and the resulting optimization.
The cunroll size estimate doesn't properly "simulate" through the body,
there's also

  <bb 3> [local count: 715863672]:
  # counter_30 = PHI <counter_25(17), 0(2)>
  # iter_22 = PHI <iter_26(17), _1(2)>
  # ivtmp_55 = PHI <ivtmp_33(17), 2(2)>
  if (counter_30 != 2)
    goto <bb 4>; [50.00%]

which I think is statically known to evaluate to false.  What's a bit
unfortunate is that we run VRP before loop header copying but not after.

I do have somewhere a prototype to make cunroll size estimates more precise
by simulating the loop (to account for unreachable parts).

It's all optimized away with -O3.