Aaron Sherman <[EMAIL PROTECTED]> wrote:
> On Fri, 2004-04-23 at 14:52, Leopold Toetsch wrote:
>>
>> I'd posted that as well. Here again with an O3 build of parrot:

> Oops, missed that. Thanks! I'm shocked by the difference in
> performance... it makes me wonder how efficient the optimization+JIT is
> when the two operations are SO different.

The difference isn't due to bad JIT code. It's simply that an internal
hyper (prefix) opcode working on INTVALs or FLOATVALs *and* with
knowledge of the underlying array can achieve a lot more than a
generalized scheme of some keyed opcodes plus a loop. These keyed
opcodes might already be optimized, e.g. using INTVALs for keys, or
even (useless and unimplementable - hi Dan:) multi-keyed opcodes.

The Perl6ish »op« is an example where generalization doesn't really
help. Introducing distinct vtable slots in every non-aggregate PMC for
math operations that are only ever used on aggregates is suboptimal,
as is permutating the opcode set with hyper or multi-keyed variants.

A hyper op isn't an opcode, it's a "map the aggregate's members through
an opcode" operation. It should be implemented like that.
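In that spirit, a single "map" entry point can apply any scalar op over
an aggregate's members, instead of minting a new opcode per op/aggregate
combination. A minimal sketch in C (invented names, not Parrot's
implementation):

```c
#include <stddef.h>

typedef long INTVAL;
typedef INTVAL (*scalar_op_fn)(INTVAL);

/* One generic hyper entry point: map a scalar op over the members,
 * rather than permutating the opcode set with hyper variants. */
static void hyper_map(INTVAL *data, size_t n, scalar_op_fn op)
{
    for (size_t i = 0; i < n; i++)
        data[i] = op(data[i]);
}

/* Any scalar op can be plugged in unchanged. */
static INTVAL negate(INTVAL x) { return -x; }
static INTVAL twice(INTVAL x)  { return 2 * x; }
```

The scalar ops stay scalar; only the map knows about the aggregate.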

And that's the difference in performance I've shown.

> ... I must simply not understand
> what's going on at the lowest level here.

Well, the "lowest level" here spans a wide performance range, from an
interpreted "everything is an object" language "down" to optimized C
code. Have a look (again) at the mops tests. They range from 2 MOps to
800 MOps (on an Athlon 800), i.e. between these two POVs you have a
factor of 400 in that tight-loop case (which of course isn't typical
for RL programs, but it shows the range nevertheless).

leo
