--- Leopold Toetsch <[EMAIL PROTECTED]> wrote:
> Nicholas Clark wrote:
>
> > So I'm confused. It looks like some bits of perl are incredibly sensitive
> > to cache alignment, or something similar.
>
> This reminds me of my remarks on JITed mops.pasm, which varied ~50% (or
> more) depending on the position of the loop in memory. See near the end
> of jit/i386/jit_emit.h.
>
> And no, I still don't know what's going on.
>
> (The story for perl5-porters + my comment:
> the loop is just 1 subtraction and a conditional jump. Inserting nops
> before this loop has a drastic impact on performance. Below is the gdb
> output of the loop.)
>
> /* my i386/athlon has a drastic speed penalty for what?
>  * not for unaligned odd jump targets
>  *
>  * But:
>  * mops.pbc 790 => 300-530 if code gets just 4 bytes bigger
>  * (loop is at 200 instead of 196 ???)
>  *
>  * FAST:
>  * 0x818100a <jit_func+194>: sub %edi,%ebx
>  * 0x818100c <jit_func+196>: jne 0x818100a <jit_func+194>
>  *
>  * Same fast speed w/o 2nd register
>  * 0x8181102 <jit_func+186>: sub 0x8164c2c,%ebx
>  * 0x8181108 <jit_func+192>: jne 0x8181102 <jit_func+186>
>  *
>  * SLOW (same slow with register or odd aligned)
>  * 0x818118a <jit_func+194>: sub 0x8164cac,%ebx
>  * 0x8181190 <jit_func+200>: jne 0x818118a <jit_func+194>
>  *
>  */
>
> > Nicholas Clark
>
> leo

The slow one has the loop crossing over a 16-byte boundary. Try moving it over a bit.
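
For anyone who wants to check the arithmetic, here is a rough sketch in Python that tests each loop from the gdb output against 16-byte boundaries. The addresses are copied from the quoted comment. The helper names (`crosses_boundary`, `report`) are made up for illustration, and I am assuming the `jne` uses the 2-byte short encoding, which matches the FAST listing where `jne` sits 2 bytes after a 2-byte `sub`.

```python
# Assumption: the backward jne is the short (rel8) form, 2 bytes long.
JNE_SHORT_LEN = 2


def crosses_boundary(first_addr, end_addr, boundary=16):
    """Return True if the bytes [first_addr, end_addr) span more than
    one boundary-aligned block."""
    return first_addr // boundary != (end_addr - 1) // boundary


# (address of the sub, address of the jne), taken from the gdb output.
loops = {
    "fast (register)": (0x818100A, 0x818100C),
    "fast (memory)": (0x8181102, 0x8181108),
    "slow": (0x818118A, 0x8181190),
}


def report():
    """Map each loop name to whether its code crosses a 16-byte boundary."""
    results = {}
    for name, (loop_start, jne_addr) in loops.items():
        loop_end = jne_addr + JNE_SHORT_LEN  # first byte after the loop
        results[name] = crosses_boundary(loop_start, loop_end)
    return results


if __name__ == "__main__":
    for name, crossed in report().items():
        print(f"{name}: crosses 16-byte boundary = {crossed}")
```

Only the slow loop straddles a boundary: its `sub` starts at 0x818118a and its `jne` starts exactly at 0x8181190. This only shows that the layout matches the observation; it does not prove that the boundary crossing is the cause of the slowdown.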