https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97421
            Bug ID: 97421
           Summary: [10/11 Regression] aarch64: Wrong code with -O2 -fmodulo-sched since r10-1318-ga7e8a46
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: acoplan at gcc dot gnu.org
  Target Milestone: ---

AArch64 GCC miscompiles the following testcase:

int a, b, d, e;
int *volatile c = &a;
__attribute__((noinline))
void f(void)
{
  for (int g = 2; g >= 0; g--) {
    d = 0;
    for (b = 0; b <= 2; b++)
      ;
    e = *c;
  }
}
int main(void)
{
  f();
  if (b != 3)
    __builtin_abort();
}

with -O2 -fmodulo-sched since r10-1318-ga7e8a463cd1dbaccf6e7c4fa888768fcd257a30f.
Removing -fmodulo-sched, the testcase is compiled correctly (tested at -O0
through -O3).

Here is the generated code (current trunk) for f at -O2:

f:
        mov     w1, 3
        adrp    x0, .LANCHOR0
        adrp    x3, .LANCHOR1
        mov     w4, w1
        add     x0, x0, :lo12:.LANCHOR0 // x0 <- &d
        add     x3, x3, :lo12:.LANCHOR1 // x3 <- &c
.L2:
        ldr     x2, [x3]                // x2 <- c
        stp     wzr, w4, [x0]           // d = 0, b = 3
        subs    w1, w1, #1
        ldr     w2, [x2]                // w2 <- *c
        str     w2, [x0, 8]             // e <- w2
        bne     .L2
        ret

and the code for f with -O2 -fmodulo-sched:

f:
        mov     w1, 2
        adrp    x0, .LANCHOR0
        mov     w4, w1
        add     x0, x0, :lo12:.LANCHOR0
        adrp    x3, .LANCHOR1
        add     x3, x3, :lo12:.LANCHOR1
        ldr     x2, [x3]                // x2 <- c
        str     w1, [x0, 4]             // b = 2
.L2:
        str     wzr, [x0]               // d = 0
        ldr     w2, [x2]                // w2 <- *c
        subs    w1, w1, #1
        str     w2, [x0, 8]             // e <- w2
        ldr     x2, [x3]                // x2 <- c
        str     w4, [x0, 4]             // b = 2
        bne     .L2
        ldr     w2, [x2]                // w2 <- *c
        str     w2, [x0, 8]             // e <- w2
        ret

In the latter case we end up with b = 2 and make the call to __builtin_abort().
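
For readers who prefer C to assembly, the two listings can be replayed by hand as a rough sketch. The function names model_o2 and model_modulo_sched are hypothetical, and the locals w1, w4, w2 and x2 only mirror the registers in the listings. This is a transcription of the stores shown above, not compiler output, and it does not try to explain why the modulo scheduler produces the second sequence.

```c
/* Globals as in the testcase; made static so the sketch is self-contained. */
static int a, b, d, e;
static int *volatile c = &a;

/* Hypothetical replay of the -O2 listing: returns the final value of b. */
int model_o2(void)
{
    int w1 = 3, w4 = w1;        /* mov w1, 3; mov w4, w1 */
    do {
        int *x2 = c;            /* ldr x2, [x3] */
        d = 0;                  /* stp wzr, w4, [x0] (first word)  */
        b = w4;                 /* stp wzr, w4, [x0] (second word) */
        w1--;                   /* subs w1, w1, #1 */
        e = *x2;                /* ldr w2, [x2]; str w2, [x0, 8] */
    } while (w1 != 0);          /* bne .L2 */
    return b;
}

/* Hypothetical replay of the -O2 -fmodulo-sched listing. */
int model_modulo_sched(void)
{
    int w1 = 2, w4 = w1;        /* mov w1, 2; mov w4, w1 */
    int *x2 = c;                /* prologue: ldr x2, [x3] */
    b = w1;                     /* prologue: str w1, [x0, 4] */
    do {
        d = 0;                  /* str wzr, [x0] */
        int w2 = *x2;           /* ldr w2, [x2] */
        w1--;                   /* subs w1, w1, #1 */
        e = w2;                 /* str w2, [x0, 8] */
        x2 = c;                 /* ldr x2, [x3] */
        b = w4;                 /* str w4, [x0, 4] */
    } while (w1 != 0);          /* bne .L2 */
    e = *x2;                    /* epilogue: ldr w2, [x2]; str w2, [x0, 8] */
    return b;
}
```

Under this reading, model_o2() returns 3, the value the inner loop in the source leaves in b, while model_modulo_sched() returns 2: the trip count was lowered from 3 to 2 for the pipelined loop, and the same constant is also what gets stored to b.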