I now have some better timing results, with less loop overhead.
MMD add PerlInt PerlInt   0.698083
MMD add PerlInt INTVAL    0.601620
MMD sub PerlInt PerlInt   0.513797
MMD sub PerlInt INTVAL    0.457096
PIR add PerlInt PerlInt   7.789893
PIR sub PerlInt PerlInt   4.448320
These are 5 million add or sub instructions on an AMD 800, --optimize build. The add opcode isn't cached; the sub opcode is. The last two lines are with overloaded MMD functions.
The speedups are ~34% and 77% respectively. The latter number is remarkable: due to the recompilation of the sub opcode, it gets
converted to a direct (invokecc-like) function call without entering a second run-loop.
#! perl5_80
use overload '-' => \&my_sub;
# here we come (14.5 seconds)
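For anyone who wants to try the Perl 5 side themselves, here is a minimal sketch of what such a test might look like. Only the `use overload` line comes from the snippet above; the `MyInt` package, the body of `my_sub`, and the timing loop are assumptions made up for illustration, so the numbers it prints are not comparable to the ones quoted here.

```perl
#!/usr/bin/perl
# Hypothetical benchmark sketch: everything except the `use overload`
# line is an assumption, not the code that was actually timed.
use strict;
use warnings;
use Time::HiRes qw(time);

package MyInt;
use overload '-' => \&my_sub;

# Wrap a plain integer in a blessed hash (illustrative only).
sub new {
    my ($class, $v) = @_;
    return bless { v => $v }, $class;
}

# Overload handler; $swapped is true when the object was the right operand.
sub my_sub {
    my ($x, $y, $swapped) = @_;
    my $l = $x->{v};
    my $r = ref $y ? $y->{v} : $y;
    ($l, $r) = ($r, $l) if $swapped;
    return MyInt->new($l - $r);
}

package main;

my $n   = 5_000_000;          # same iteration count as the timings above
my $lhs = MyInt->new(10);
my $rhs = MyInt->new(3);
my $res;

my $t0 = time();
$res = $lhs - $rhs for 1 .. $n;
printf "%d overloaded subs: %.2f s\n", $n, time() - $t0;
```

The loop itself still contributes some overhead, so an empty loop of the same length would have to be timed and subtracted to get a per-operation figure.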
leo