I was looking at the list of bytecode instructions that Python uses and I noticed how much it looked like assembly. So I figured it can't be too hard to convert this to actual machine code, to get at least a small boost in speed.

And so I whipped up a proof of concept, available at https://github.com/Rouslan/nativecompile

I'm aware that PyPy already has a working JIT compiler, but I figure it will be a long time before they have a version of Python that is ready for everybody to use, so this could be useful in the meantime.

I chose to target the latest stable version of Python, and I happen to use some functionality that has only been available since Python 3.2.

The basic usage is:

>>> import nativecompile
>>> bcode = compile('print("Hello World!")','<string>','exec')
>>> mcode = nativecompile.compile(bcode)
>>> mcode()
Hello World!
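
For anyone who wants to see the resemblance to assembly for themselves, here is a small sketch that lists the bytecode of the same code object with the standard dis module. It is only an illustration and is not part of nativecompile. It assumes a Python where dis.get_instructions exists (3.4 or later), and the exact instructions printed differ between Python versions.

```python
import dis

# The same code object that is handed to nativecompile.compile above.
bcode = compile('print("Hello World!")', '<string>', 'exec')

# Collect the opcode names; the exact list depends on the Python version.
opnames = [ins.opname for ins in dis.get_instructions(bcode)]

# Print the usual human-readable listing, one instruction per line.
dis.dis(bcode)
```

Each line of the listing is one instruction with an optional argument, which is what makes it read like assembly.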


This compiler does absolutely nothing clever. The only difference between the bytecode version and the compiled version is that there is no interpreter loop and the real stack is used instead of an array.
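
To make "no interpreter loop" concrete, here is a toy sketch. The instruction set (PUSH, ADD, MUL) and the function names are made up for illustration and have nothing to do with CPython's real opcodes or with how nativecompile is written. The first function dispatches on every instruction at run time; the second emits one fixed snippet per instruction ahead of time, so the result is straight-line code. It generates Python source rather than machine code, so it only shows the first half of the idea and still uses a list for the stack.

```python
# Toy instruction set, invented for illustration (not CPython's real opcodes).
PROGRAM = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]


def interpret(program):
    """Run the program with a dispatch loop and an explicit list as the stack."""
    stack = []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        else:
            raise ValueError("unknown op: %r" % (op,))
    return stack.pop()


def translate(program):
    """Emit one fixed snippet per instruction, so no dispatch happens at run time."""
    lines = ["def run():", "    stack = []"]
    for op, arg in program:
        if op == "PUSH":
            lines.append("    stack.append(%r)" % (arg,))
        elif op == "ADD":
            lines.append("    b = stack.pop(); a = stack.pop(); stack.append(a + b)")
        elif op == "MUL":
            lines.append("    b = stack.pop(); a = stack.pop(); stack.append(a * b)")
        else:
            raise ValueError("unknown op: %r" % (op,))
    lines.append("    return stack.pop()")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["run"]


print(interpret(PROGRAM))    # (2 + 3) * 4
print(translate(PROGRAM)())  # same result, without the dispatch loop
```

Both calls compute (2 + 3) * 4. The translated version pays the cost of decoding each instruction once, at translation time, instead of on every run.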

Most of it is written in Python itself. There is one module written in C that does the things that cannot easily be done in pure Python, such as getting the addresses of API functions and executing the newly created code.
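
As a rough idea of what "getting the address of an API function" means, here is a sketch that uses ctypes from the standard library. This is not how nativecompile does it (that job belongs to the C module); it is just an alternative way to look at the same information, and PyObject_GetAttr is picked only as an example.

```python
import ctypes

# ctypes.pythonapi exposes the C API of the running interpreter.
func = ctypes.pythonapi.PyObject_GetAttr

# Casting the function pointer to void* yields its address as an integer.
addr = ctypes.cast(func, ctypes.c_void_p).value

print("PyObject_GetAttr is at", hex(addr))
```

Generated machine code would call the function at an address like this one. The address usually changes from run to run, so it has to be looked up each time the process starts.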

So far I have implemented only a few bytecode instructions and support only 32-bit x86-compatible CPUs. I have only tested this on Linux. It might work on Windows, but only if you can run programs without any sort of data execution prevention (I can fix that if anyone wants). And I'm sure more optimized machine code can be generated (such as rearranging the error checking code to work better with the CPU's branch predictor).

Since so few instructions are implemented, I haven't done any benchmarks.


What do people think? Would I be wasting my time going further with this?
--
http://mail.python.org/mailman/listinfo/python-list
