Using gcc 4.4.4 -Os on
void loop(long *to, long *from, long len)
{
        for (; len; --len)
                *++to = *++from;
}
I get
/* gcc 4.4.4 -Os
loop:
        addi 5,5,1
        li 9,0
        mtctr 5
        b .L2
.L3:
        lwzx 0,4,9
        stwx 0,3,9
.L2:
        addi 9,9,4
        bdnz .L3
        blr
 */
gcc 3.4.6 has:
/* gcc 3.4.6 -Os
loop:
        mr. 0,5
        mtctr 0
        beqlr- 0
.L8:
        lwzu 0,4(4)
        stwu 0,4(3)
        bdnz .L8
        blr
 */

It doesn't matter which cpu type I select. It seems impossible
to make a newer gcc produce code as small or as fast as this.

Perhaps lwzx/stwx is faster on bigger Power cpus, but this
can't be true for all cpus, can it?
That shouldn't matter though, because I asked gcc to produce smaller
code with -Os.
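
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the test case. The loop body is the one shown at the top. The void return type is spelled out, and the loop_selftest() helper is a hypothetical addition for illustration only; it was not part of what I compiled. It just shows that the pre-increment copy touches elements 1..len and leaves element 0 alone, which is the access pattern that the update-form lwzu/stwu pair in the 3.4.6 output implements directly.

```c
#include <assert.h>

/* The loop under discussion: pre-increment both pointers, then copy. */
void loop(long *to, long *from, long len)
{
        for (; len; --len)
                *++to = *++from;
}

/* Hypothetical helper, not part of the original test case. */
void loop_selftest(void)
{
        long src[5] = { 0, 11, 22, 33, 44 };
        long dst[5] = { -1, -1, -1, -1, -1 };

        loop(dst, src, 4);

        /* Element 0 is skipped because the pointers are bumped first. */
        assert(dst[0] == -1);
        assert(dst[1] == 11);
        assert(dst[2] == 22);
        assert(dst[3] == 33);
        assert(dst[4] == 44);
}
```

Compiling only loop() with -Os -S should be enough to compare the generated assembly between compiler versions.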

    Jocke
