https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61837
luoxhu at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |luoxhu at gcc dot gnu.org

--- Comment #3 from luoxhu at gcc dot gnu.org ---
"addi 8,4,-1" and "subf 9,8,5" could not be hoisted out of the loop because
of a dependency on "lbzu 9,1(8)": lbzu updates r8, so r8 needs to be
reinitialized to p2-1 in each iteration of the outer loop. Only the result of
"subf 9,8,5", i.e. (p2+s-1)-(p2-1), is loop invariant.

However, the code generated by the latest GCC could still be optimized
further, since A, B and C are loop invariant:

foo:
.LFB0:
        .cfi_startproc
        cmpwi 7,5,0
        li 6,0
        rldicl 5,5,0,32
        li 7,0
        .p2align 4,,15
.L2:
        ble 7,.L7
        addi 8,5,-1        // A
        addi 10,4,-1
        rldicl 8,8,0,32    // B
        mr 9,3
        addi 8,8,1         // C
        mtctr 8
        .p2align 5
.L4:
        lbzu 8,1(10)
        cmpw 0,8,7
        bne 0,.L3
        stw 6,0(9)
.L3:
        addi 9,9,4
        bdnz .L4
.L7:
        addi 6,6,88
        addi 7,7,1
        cmpwi 0,6,8888
        extsw 7,7
        extsw 6,6
        bne 0,.L2
        blr
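
For illustration only, here is a rough C sketch of what hoisting A, B and C would amount to. The C source is not part of this message, so the loop nest below is a reconstruction inferred from the assembly (101 outer iterations, a step of 88, a byte load compared against the outer counter). The names foo_sketch, foo_hoisted and inner_trip_count are hypothetical, and the sketch assumes the byte array is unsigned, as lbzu suggests.

```c
#include <stdint.h>

/* Hypothetical reconstruction of the loop nest, inferred from the assembly;
   it is not quoted from the bug report. */
void foo_sketch(int *p1, const unsigned char *p2, int s)
{
    int v = 0;
    for (int n = 0; n <= 100; n++) {      /* the asm exits once v reaches 8888 */
        for (int i = 0; i < s; i++)
            if (p2[i] == n)
                p1[i] = v;
        v += 88;
    }
}

/* Models A, B, C from the listing; only meaningful when s > 0. */
uint64_t inner_trip_count(int s)
{
    uint32_t a = (uint32_t)s - 1u;        /* A: addi 8,5,-1 */
    uint64_t b = (uint64_t)a;             /* B: rldicl 8,8,0,32 (zero-extend) */
    return b + 1u;                        /* C: addi 8,8,1 */
}

/* Same loop nest with A, B, C computed once, before the outer loop. */
void foo_hoisted(int *p1, const unsigned char *p2, int s)
{
    uint64_t trips = (s > 0) ? inner_trip_count(s) : 0;
    int v = 0;
    for (int n = 0; n <= 100; n++) {
        /* The walking pointers must still be reset on every outer iteration,
           just as r10 and r9 are reloaded in the asm. */
        const unsigned char *q = p2;
        int *out = p1;
        for (uint64_t ctr = trips; ctr != 0; ctr--) {   /* mtctr / bdnz */
            if (*q++ == n)
                *out = v;
            out++;
        }
        v += 88;
    }
}
```

Only the trip count moves out of the outer loop; the pointer setup stays inside because the inner loop advances those pointers, which is the same dependency described for lbzu above.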