Paolo 'Blaisorblade' Giarrusso <p.giarru...@gmail.com> added the comment:

> You may want to check out issue1408710 in which a similar patch was
> provided, but failed to deliver the desired results.
It's not really similar, because you don't duplicate the dispatch code.
It took me some time to understand why you didn't change the "goto
fast_next_opcode", but that's where you miss the speedup.

The only difference with your change is that you save the range check
for the switch, so the slowdown probably comes from some minor output
change from GCC I guess.

Anyway, this suggests that the speedup really comes from better branch
prediction and not from saving the range check. The 1st paper I
mentioned simply states that saving the range check might make a small
differences. The point is that sometimes, when you are going to flush
the pipeline, it's like adding a few instructions, even conditional
jumps, does not make a difference. I've observed this behaviour quite a
few times while building from scratch a small Python interpreter.

I guess (but this might be wrong) that's because the execution units
were not used at their fullest, and adding conditional jumps doesn't
make a differeence because flushing a pipeline once or twice is almost
the same (the second flush removes just few instructions). Or something
like that, I'm not expert enough of CPU architecture to be sure of such
guesses.

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue4753>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to