On Fri, May 27, 2016 at 07:45:18AM +0200, Christophe Leroy wrote:
> >>Wouldn't it be better to add nops before the function entry in order to
> >>get the hot loop aligned, instead of adding nops in the middle of the
> >>function ?
> >Why would that be better?  The nops are executed once per function call
> >in either case, there are the same number of nops in either case, and
> >on most CPUs nops aren't actually executed anyway (they are decoded and
> >then thrown away).
> >
> The idea was to not execute them:
> 
>     .balign 16
>     nop
>     nop
> _GLOBAL(strcpy)
>     addi    r5,r3,-1
>     addi    r4,r4,-1
> 1:  lbzu    r0,1(r4)
>     cmpwi   0,r0,0
>     stbu    r0,1(r5)
>     bne     1b
>     blr

That performs _worse_ on most modern CPUs (the first decode will decode
less, so instructions are available for execution later).  That's why
functions are aligned in the first place!
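
A rough sketch of the effect, under a deliberately simple model: assume the
CPU fetches and decodes one aligned 16-byte block at a time (the block size
is an assumption for illustration and varies by CPU), with fixed 4-byte
instructions.  The helper name is hypothetical.  Entering a function partway
into a block leaves fewer instructions for the first decode:

```python
INSN_BYTES = 4    # PowerPC instructions are a fixed 4 bytes wide
FETCH_BYTES = 16  # assumed fetch/decode block size; real CPUs differ


def first_decode_count(entry_offset: int) -> int:
    """Instructions left in the first fetch block when the function entry
    sits entry_offset bytes past a FETCH_BYTES boundary (simple model)."""
    offset_in_block = entry_offset % FETCH_BYTES
    return (FETCH_BYTES - offset_in_block) // INSN_BYTES


# Entry on the 16-byte boundary: the whole block is available.
print(first_decode_count(0))  # 4
# Entry after two leading nops (offset 8), as in the quoted layout.
print(first_decode_count(8))  # 2
```

This only counts instructions in the first block; it does not model anything
else about a real pipeline.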


Segher
_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
