Aaron Sherman <[EMAIL PROTECTED]> wrote:

> On Fri, 2004-04-23 at 14:52, Leopold Toetsch wrote:
>>
>> I'd posted that as well. Here again with an O3 build of parrot:
> Oops, missed that. Thanks! I'm shocked by the difference in
> performance... it makes me wonder how efficient the optimization+JIT is
> when the two operations are SO different.

The difference isn't due to bad JIT code. It's simply that an internal
hyper (prefix) opcode working on INTVALs or FLOATVALs *and* with
knowledge of the underlying array can achieve a lot more than a
generalized scheme of some keyed opcodes plus a loop. These keyed
opcodes might already be optimized, e.g. using INTVALs for keys, or even
(useless and unimplementable - hi Dan :) multi-keyed opcodes.

The Perl6ish »op« is an example where generalization doesn't really
help. Introducing distinct vtable slots in every non-aggregate PMC for
unused math operations on aggregates is suboptimal, as is permutating
opcodes with hyper or multi-keyed variants. A hyper op isn't an opcode,
it's a "map the aggregate's members onto an opcode" operation, and it
should be implemented like that. That's the difference in performance
I've shown.

> ... I must simply not understand
> what's going on at the lowest level here.

Well, the "lowest level" - that is the wide performance range from an
interpreted "everything is an object" language "down" to optimized C
code. Have a look (again) at the mops tests. They reach from 2 MOps to
800 MOps (on an Athlon 800), i.e. between these two POVs you have a
factor of 400 in that tight-loop case (which of course isn't typical
for real-life programs, but it shows the range nevertheless).

leo