Leopold Toetsch wrote:
Nick Glencross <[EMAIL PROTECTED]> wrote:


Still some way off the OS md5sum, which typically takes 0.15 seconds, about
12x quicker. That may sound like quite a bit, but much of it can probably be
accounted for by inefficiencies in my conversion to parrot code (a
slightly awkward rol, and perhaps the manipulation of the arrays).


A new opcode C<rol> would certainly help, yes. It would replace 4
instructions (each roughly executed once per char) with one instruction.

A JITted C<rol> opcode should give a speedup of one fourth - which is a
lot.

I was a bit too optimistic with my assumptions here. I have now implemented a C<rot> opcode, and the one signature of it that MD5 uses is also implemented as a JIT opcode for x86. But the speedup is much smaller: around 5%.
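
For reference, here is a rough C sketch of what the rotate amounts to, written once as separate shift/subtract/shift/or steps and once in the compact form that most compilers turn into a single rotate instruction. This is not the Parrot code: the function names are made up for illustration, and the exact four instructions in the original PASM may differ from the ones shown. The md5_step helper follows the standard MD5 step definition (64 steps per 64-byte block, so about one rotate per input byte).

```c
#include <stdint.h>

/* Rotate left spelled out as separate operations, roughly the kind of
   multi-instruction sequence a single rotate opcode would replace.
   (Assumed breakdown; the original PASM sequence may differ.) */
static uint32_t rol_by_steps(uint32_t x, unsigned n)
{
    n &= 31;                     /* keep the count in 0..31 */
    if (n == 0)
        return x;                /* avoid an undefined shift by 32 */
    uint32_t hi = x << n;        /* 1: shift left */
    unsigned back = 32 - n;      /* 2: complementary shift count */
    uint32_t lo = x >> back;     /* 3: shift right */
    return hi | lo;              /* 4: combine the two halves */
}

/* Compact form of the same rotate. */
static uint32_t rol(uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

/* One step of MD5 round 1: a = b + rol(a + F(b,c,d) + x + t, s),
   with F(b,c,d) = (b & c) | (~b & d). */
static uint32_t md5_step(uint32_t a, uint32_t b, uint32_t c, uint32_t d,
                         uint32_t x, uint32_t t, unsigned s)
{
    uint32_t f = (b & c) | (~b & d);
    return b + rol(a + f + x + t, s);
}
```

The rotate is only one of several integer operations in each step, so replacing it alone leaves most of the per-step work unchanged.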


md5sum of perl-5.8.0.tar.gz, size=11023084 bytes (times in seconds)

md5sum         0.11 user,  0.20 real
parrot -j      2.63 user,  2.68 real
parrot -j rot  2.57 user,  2.75 real

The problem with the MD5 code under the Parrot JIT seems to be related to the register allocator. The MD5 code is one big basic block of integer code. As we don't do any register renaming, CPU register usage is suboptimal, especially on x86.

leo


