Demur Rumed added the comment:

I'll dig up benchmark results when I get home, but I'd be interested to see 
results from a less wannabe-RISC CPU.

The change is to have every instruction take an argument. This removes the 
per-instruction branch on whether to load oparg. It also makes every 
instruction exactly 2 bytes, rather than 1 or 3, by limiting the argument to 
1 byte. When an argument is greater than 255, the instruction can be prefixed 
with up to 3 chained EXTENDED_ARGs. In practice this rarely occurs, mostly 
only for jumps, and abarnert measured the stdlib to be ~5% smaller.
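
A minimal sketch of the 2-byte encoding described above, written in Python 
rather than the C of the patch. The helper names encode and decode are 
hypothetical and do not come from the patch; the only thing taken from CPython 
is the EXTENDED_ARG opcode number, looked up through dis.opmap:

```python
import dis

EXTENDED_ARG = dis.opmap["EXTENDED_ARG"]

def encode(opcode, oparg=0):
    """Encode one instruction as 2-byte units (hypothetical helper)."""
    if not 0 <= oparg <= 0xFFFFFFFF:
        raise ValueError("oparg must fit in 32 bits")
    out = bytearray()
    # Higher-order bytes go first, each carried by an EXTENDED_ARG prefix.
    # At most 3 prefixes are emitted, one per byte above the lowest.
    for shift in (24, 16, 8):
        if oparg >> shift:
            out += bytes([EXTENDED_ARG, (oparg >> shift) & 0xFF])
    # The real instruction always comes last and holds the low byte.
    out += bytes([opcode, oparg & 0xFF])
    return bytes(out)

def decode(code):
    """Yield (offset, opcode, oparg), folding EXTENDED_ARG prefixes in."""
    ext = 0
    for offset in range(0, len(code), 2):
        opcode, oparg = code[offset], code[offset + 1] | ext
        if opcode == EXTENDED_ARG:
            ext = oparg << 8  # carry the accumulated bits to the next unit
            continue
        ext = 0
        yield offset, opcode, oparg
```

An argument of at most 255 costs 2 bytes here, and every additional byte of 
argument costs one more 2-byte EXTENDED_ARG unit, which is why an increase in 
EXTENDED_ARGs is the case to watch.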

The rationale is that this offers 3 benefits: smaller code size, simpler 
instruction iteration/indexing (one may now scan backwards, as peephole.c does 
in this patch), and, as a result of those two, a small perf gain. The only way 
for perf to be negatively impacted is by an increase in EXTENDED_ARGs; when I 
post benchmarks I'll also post a count of how many more EXTENDED_ARGs are 
emitted.
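
To illustrate the backwards scan: with fixed 2-byte instructions, the previous 
instruction always starts 2 bytes earlier, so no forward pass is needed to 
find instruction boundaries. This is only a Python sketch of the idea, not the 
code in peephole.c, and iter_backwards is a hypothetical name:

```python
def iter_backwards(code, start=None):
    """Yield (offset, opcode, arg_byte) from `start` down to offset 0.

    Assumes `code` uses fixed 2-byte instructions. arg_byte is the raw
    1-byte argument; EXTENDED_ARG prefixes are not folded in here.
    """
    if len(code) % 2:
        raise ValueError("code length must be a multiple of 2")
    if start is None:
        start = len(code) - 2  # offset of the last instruction
    for offset in range(start, -1, -2):
        yield offset, code[offset], code[offset + 1]
```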

This also means that if I want to create something like a tracer that tracks 
some information for each instruction, I can allocate an array of codesize/2 
bytes, then index it by half the instruction's byte offset. peephole.c does 
not currently do this, nor does this patch include halving jump opargs.
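
A rough sketch of that tracer idea, assuming one byte of bookkeeping per 
instruction. The names make_counters and record are hypothetical, and the 
saturating hit counter is just one example of per-instruction information:

```python
def make_counters(code):
    """Allocate one byte per 2-byte instruction, i.e. codesize/2 bytes."""
    return bytearray(len(code) // 2)

def record(counters, offset):
    """Bump the counter for the instruction at byte `offset`."""
    slot = offset // 2  # half the instruction's byte offset
    # Saturate at 255 so the count still fits in a single byte.
    counters[slot] = min(counters[slot] + 1, 255)
```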

I've looked up the 'recent work to cache attribute/global lookup' issue I 
mentioned: http://bugs.python.org/issue26219
I believe that patch would benefit from this one, but it'd be better to get 
Yury's opinion on that.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26647>
_______________________________________
