https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93007
Bug ID: 93007
Summary: [10 regression] pr77698.c testcase fails due to block commoning
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: wilco at gcc dot gnu.org
Target Milestone: ---

Since r276960 we see this failure on Arm:

FAIL: gcc.dg/tree-prof/pr77698.c scan-rtl-dump-times alignments "internal loop alignment added" 1

The issue appears to be that basic block commoning works on an unrolled loop,
which is unlikely to be beneficial for performance:

.L17:
        adds    r0, r0, #1
        b       .L27
.L6:
        ldr     r4, [r2, #12]
        adds    r0, r0, #4
        ldr     lr, [r1]
        str     lr, [r3, r4, lsl #2]
        ldr     r4, [r2, #12]
        ldr     lr, [r1]
        str     lr, [r3, r4, lsl #2]
        ldr     r4, [r2, #12]
        ldr     lr, [r1]
        str     lr, [r3, r4, lsl #2]
.L27:
        ldr     r4, [r2, #12]
        cmp     ip, r0
        ldr     lr, [r1]
        str     lr, [r3, r4, lsl #2]
        bne     .L6
        pop     {r4, pc}

The test could be easily fixed, but ensuring block commoning takes loops and
execution frequencies into account would be better overall.
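
For illustration only, the control flow in the listing can be modelled roughly in C. This is a hypothetical sketch, not the pr77698.c source and not compiler output: the names `body`, `sink`, `stores`, `run_separate` and `run_commoned` are invented here, and the trip count is assumed to be of the form 4k+1 so that exactly one iteration is peeled. `run_separate` keeps the peeled iteration outside the 4x unrolled loop; `run_commoned` shares the last copy of the body with the peeled iteration by jumping into the middle of the loop, as the `.L17` / `.L27` labels do above.

```c
#include <assert.h>

/* Hypothetical loop body: one store, counted so both variants can be compared. */
static volatile int sink;
static unsigned long stores;

static void body(int v)
{
    sink = v;
    stores++;
}

/* Peeled iteration kept separate: the unrolled loop has a single entry at its top. */
unsigned long run_separate(unsigned long n, int v)
{
    assert(n % 4 == 1);            /* assumption of this sketch: n = 4k + 1 */
    unsigned long i = 0;
    stores = 0;

    body(v);                       /* peeled iteration, its own copy of the body */
    i += 1;
    while (i != n) {               /* 4x unrolled loop */
        body(v);
        body(v);
        body(v);
        body(v);
        i += 4;
    }
    return stores;
}

/* Peeled iteration commoned with the last copy in the loop (models .L17 -> .L27). */
unsigned long run_commoned(unsigned long n, int v)
{
    assert(n % 4 == 1);            /* assumption of this sketch: n = 4k + 1 */
    unsigned long i = 0;
    stores = 0;

    i += 1;                        /* .L17: bump the counter ... */
    goto last;                     /* ... and jump into the middle of the loop */
top:                               /* .L6 */
    body(v);
    i += 4;
    body(v);
    body(v);
last:                              /* .L27: shared copy of the body */
    body(v);
    if (i != n)
        goto top;
    return stores;
}
```

Both functions perform the same number of stores for a given n, for example `run_separate(5, 7)` and `run_commoned(5, 7)` each execute the body five times. The difference is only in layout: in the second form the loop gains a second entry point in the middle of its body, which is the shape the report argues is unlikely to help performance.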