> I think it also improves branch target prediction - if you have a tight
> loop of a few opcodes the predictor can guess where you're headed (since
> there is a separate lookup key for each opcode), whereas with the
> original code, there's a single key which cannot be used to predict the
> branch target.

At least usually. I caught at least one version of gcc CSEing the jump
instruction into a common location in one of my interpreters :-(
That puts you back to a single shared indirect branch. But not all
versions do it, and I hope gcc gets fixed.

Even when that happens, it's usually still faster for other reasons.
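
For reference, a minimal sketch of the kind of dispatch being discussed,
assuming gcc's labels-as-values extension. The opcode names and the toy
stack machine are made up for illustration and don't come from any real
interpreter:

```c
#include <assert.h>

/* Hypothetical opcodes for a toy stack machine (illustration only). */
enum { OP_PUSH1, OP_ADD, OP_DUP, OP_HALT };

/* Threaded dispatch using gcc's labels-as-values extension. */
int run_threaded(const unsigned char *pc)
{
    /* One jump-table entry per opcode, indexed by opcode value. */
    static void *const targets[] = {
        &&do_push1, &&do_add, &&do_dup, &&do_halt
    };
    int stack[16];
    int sp = 0;

/* Every handler ends in its own indirect jump, so each jump site can
 * get its own branch predictor entry. If the compiler merges these
 * jumps into one shared location, that property is lost. */
#define DISPATCH() goto *targets[*pc++]

    DISPATCH();

do_push1:
    assert(sp < 16);
    stack[sp++] = 1;
    DISPATCH();
do_add:
    assert(sp >= 2);
    sp--;
    stack[sp - 1] += stack[sp];
    DISPATCH();
do_dup:
    assert(sp >= 1 && sp < 16);
    stack[sp] = stack[sp - 1];
    sp++;
    DISPATCH();
do_halt:
    assert(sp >= 1);
    return stack[sp - 1];
#undef DISPATCH
}
```

Whether the jumps stay separate depends on the compiler version and
options, so it's worth checking the generated assembly. As far as I
know, the gcc manual suggests -fno-gcse for code using computed gotos,
but I haven't verified that it avoids the merging in every version.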

-Andi

