Re: benchmarking - it's now all(-1,0,1,5,6)% faster

Mr. Nobody Sat, 11 Jan 2003 15:05:30 -0800

--- Leopold Toetsch <[EMAIL PROTECTED]> wrote:
> Nicholas Clark wrote:
> 
> 
> > So I'm confused. It looks like some bits of perl are incredibly sensitive to
> > cache alignment, or something similar.
> 
> 
> This reminds me of my remarks on JITed mops.pasm, which varied ~50% (or 
> more) depending on the position of the loop in memory. See near the end 
> of jit/i386/jit_emit.h.
> 
> 
> And no, I still don't know what's going on.
> 
> 
> (The story for perl5-porters + my comment:
>   the loop is just 1 subtraction and a conditional jump. Inserting nops 
> before this loop has a drastic impact on performance. Below is the gdb 
> output of the loop.)
> 
> /* my i386/athlon has a drastic speed penalty for what?
>   * not for unaligned odd jump targets
>   *
>   * But:
>   * mops.pbc 790 => 300-530  if code gets just 4 bytes bigger
>   * (loop is at 200 instead of 196 ???)
>   *
>   * FAST:
>   * 0x818100a <jit_func+194>:    sub    %edi,%ebx
>   * 0x818100c <jit_func+196>:    jne    0x818100a <jit_func+194>
>   *
>   * Same fast speed w/o 2nd register
>   * 0x8181102 <jit_func+186>:    sub    0x8164c2c,%ebx
>   * 0x8181108 <jit_func+192>:    jne    0x8181102 <jit_func+186>
>   *
>   * SLOW (same slow with register or odd aligned)
>   * 0x818118a <jit_func+194>:    sub    0x8164cac,%ebx
>   * 0x8181190 <jit_func+200>:    jne    0x818118a <jit_func+194>
>   *
>   */
>
> > Nicholas Clark
>
> leo


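The slow one has the loop crossing a 16-byte boundary: the sub starts at
0x818118a and the jne lands at 0x8181190. Both fast ones fit inside a single
16-byte block. Try moving it over a bit.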
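
To check that claim against the gdb addresses, here is a minimal sketch in C.
The helper names crosses_16 and pad_to_16 are made up for illustration and are
not from jit_emit.h. The loop lengths assume the jne uses the 2-byte short-jump
encoding, which gives 4 bytes for the register form and 8 bytes for the
memory-operand form. Padding up to the next 16-byte boundary is just one way
to "move it over", not necessarily what the JIT should do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the byte range [start, start + len)
 * touches more than one 16-byte aligned block, 0 otherwise. */
static int crosses_16(uintptr_t start, unsigned len)
{
    assert(len > 0);
    uintptr_t last = start + len - 1;   /* address of the final byte */
    return (start >> 4) != (last >> 4); /* compare 16-byte block indices */
}

/* Hypothetical helper: number of padding bytes (e.g. nops) needed before
 * `start` so that the code begins on the next 16-byte boundary.
 * Returns 0 if `start` is already aligned. */
static unsigned pad_to_16(uintptr_t start)
{
    return (unsigned)((16 - (start & 15)) & 15);
}
```

With the addresses from the comment, crosses_16(0x818118a, 8) is true for the
slow loop, while crosses_16(0x818100a, 4) and crosses_16(0x8181102, 8) are
false for the two fast ones. pad_to_16(0x818118a) gives 6, the number of nop
bytes that would push the slow loop to 0x8181190.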
