------- Comment #39 from pthaugen at gcc dot gnu dot org  2010-07-21 21:51 -------
(In reply to comment #38)
>
> .L2:
>         addi 11,8,9216
>         ldx 0,10,9
>         stdx 0,11,9
>         addi 9,9,8
>         bdnz .L2
>
> and in r161844:
>
> .L2:
>         ldu 0,8(11)
>         stdu 0,8(9)
>         bdnz .L2
>
> I'm no expert on powerpc architecture, but 3 instructions versus 5 looks like
> a win to me.  Bit-rotten test case?
>

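For reference, a rough C-level analogy of the two quoted loop shapes. This is only a sketch and not the PR's testcase: the function names and the self-check are made up for illustration, and the compiler is free to emit either addressing form for either function depending on version and -mcpu.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Indexed shape: two fixed base pointers plus one shared index,
   loosely matching the ldx/stdx + addi loop. */
void copy_indexed(uint64_t *dst, const uint64_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Pointer-bump shape: each pointer advances by one 8-byte element per
   iteration, loosely matching the ldu/stdu loop. */
void copy_bump(uint64_t *dst, const uint64_t *src, size_t n)
{
    while (n-- > 0)
        *dst++ = *src++;
}

/* Hypothetical self-check: both shapes must copy the same data.
   Returns 1 on success, 0 otherwise. */
int copies_agree(void)
{
    enum { N = 16 };
    uint64_t src[N], a[N], b[N];

    for (size_t i = 0; i < N; i++)
        src[i] = (uint64_t)i * 3u + 1u;
    memset(a, 0, sizeof a);
    memset(b, 0, sizeof b);

    copy_indexed(a, src, N);
    copy_bump(b, src, N);

    return memcmp(a, src, sizeof a) == 0 && memcmp(b, src, sizeof b) == 0;
}
```

A quick sanity check is assert(copies_agree()); to see which form a given compiler actually picks, compile with -O2 -S and the -mcpu value of interest and compare the loop bodies.
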
The 'addi 11,8,9216' in the first loop is loop-invariant and should be hoisted
out of the loop. Separate issue?

As for indexed ld/st+addi vs. update-form ld/st: the update forms are cracked
into ld/st+addi, which imposes a scheduling restriction on them (cracked insns
start a dispatch group). That may not make any difference in this simple loop,
but indexed ld/st+addi may have better scheduling opportunities if there were
more insns in the loop.

This testcase also appears to depend on the -mcpu value. With -mcpu=power7 the
testcase passes (although the invariant addi is still in the loop). And if I
switch to -m32, it only fails for -mcpu=power6.


-- 

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29256