Paul Rubin, 04.08.2012 20:18:
> Stefan Behnel writes:
>>> C is pretty poor as a compiler target: how would you translate Python
>>> generators into C, for example?
>> Depends. If you have CPython available, that'd be a straight forward
>> extension type.
>
> Calling CPython hardly counts as compiling Python into C.

CPython is written in C, though. So anything that CPython does can be done
in C. It's not like the CPython project used a completely unusual way of
writing C code.

Besides, I find your above statement questionable. You will always need
some kind of runtime infrastructure when you "compile Python into C", so
you can just as well use CPython for that instead of reimplementing it
completely from scratch. Both Cython and Nuitka do exactly that, and one of
the major advantages of that approach is that they can freely interact with
arbitrary code (Python or not) that was written for CPython, regardless of
its native dependencies. What good would it be to throw all of that away,
just for the sake of having "pure C code generation"?

>> For the yielding, you can use labels and goto. Given that you generate
>> the code, that's pretty straight forward as well.
>
> You're going to compile the whole Python program into a single C
> function so that you can do gotos inside of it? What happens if the
> program imports a generator?

No, you are going to compile only the generator function into a function
that uses gotos, maybe with an additional in-out struct parameter that
holds its state. Then, on entry, you read the label (or its ID) from the
previous state, restore the local variables and jump to the label. On exit,
you store the state back and return. Cython does it that way. Totally
straightforward, as I said.

>>> How would you handle garbage collection?
>> CPython does it automatically for us at least.
>
> You mean you're going to have all the same INCREF/DECREF stuff on every
> operation in compiled data? Ugh.

If you don't like that, you can experiment with anything from a dedicated
GC to transactional memory.

>> Lacking that, you'd use one of the available garbage collection
>> implementations,
>
> What implementations would those be? There's the Boehm GC which is
> useful for some purposes but not really suitable at large scale, from
> what I can tell. Is there something else?

No idea - I'll look it up when I need one. Last I heard, PyPy had a couple
of GCs to choose from, but I don't know how closely they are tied into its
infrastructure.

>> or provide none at all.
>
> You're going to let the program just leak memory until it crashes??

Well, it's not like CPython leaks memory until it crashes, now does it? And
it's written in C. So there must be ways to handle this in C as well.
Remember that CPython didn't even have a cyclic GC (only reference
counting) before something around 2.0, IIRC. That worked quite ok in most
cases and simply left the tricky cases to the programmers.

It really depends on what your requirements are. Small embedded systems,
time critical code and real-time systems are often much better off without
garbage collection. It's pure convenience, after all.

>> you shouldn't expect too much of a performance gain from what the
>> platform gives you for the underlying implementation. It can optimise
>> the emulator, but it won't see enough of the Python code to make
>> anything efficient out of it. Jython is an example for that.
>
> Compare that to the performance gain of LuaJIT and it starts to look
> like something is wrong with that approach, or maybe some issue inherent
> in Python itself.

Huh? LuaJIT is a reimplementation of Lua that uses an optimising JIT
compiler specifically for Lua code. How is that similar to the Jython
runtime that runs *on top of* the JVM with its generic byte code based JIT
compiler? Basically, LuaJIT's JIT compiler works at the same level as the
one in PyPy, which is why both can theoretically provide the same level of
performance gains.

>> You can get pretty far with static code analysis, optimistic
>> optimisations and code specialisation.
>
> It seems very hard to do reasonable optimizations in the presence of
> standard Python techniques like dynamically poking class instance
> attributes.
> I guess some optimizations are still possible, like storing
> attributes named as literals in the program in fixed slots, saving some
> dictionary lookups even though the slot contents would have to still be
> mutable.

Sure. Even when targeting the CPython runtime with the generated C code
(like Cython or Nuitka), you can still do a lot. And sure, static code
analysis will never be able to infer everything that a JIT compiler can
see.

Stefan
-- 
http://mail.python.org/mailman/listinfo/python-list
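
For reference, the generator translation discussed above (a function that
uses gotos plus an in-out state struct) can be sketched by hand roughly as
follows. This is only an illustration under assumptions, not what Cython
actually emits: the names count_state and count_next are made up here, and
a real code generator would also have to deal with Python objects,
reference counts and exceptions.

```c
#include <stdbool.h>

/* Hypothetical state struct: the resume point plus every local
   variable that has to survive across a yield. */
typedef struct {
    int resume;   /* 0 = not started, 1 = suspended at the yield, -1 = done */
    int i;        /* saved loop variable */
    int limit;    /* argument of the generator */
} count_state;

/* Hand-written equivalent of:
 *
 *     def count(limit):
 *         for i in range(limit):
 *             yield i
 *
 * Returns true and stores the next value in *out while values remain,
 * false once the generator is exhausted. */
static bool count_next(count_state *st, int *out)
{
    int i = st->i;                 /* on entry: restore the locals */

    switch (st->resume) {          /* read the label ID from the state */
    case 0:  goto start;
    case 1:  goto resume_1;
    default: return false;         /* already exhausted */
    }

start:
    for (i = 0; i < st->limit; i++) {
        *out = i;
        st->i = i;                 /* on exit: store the state back ... */
        st->resume = 1;
        return true;               /* ... and return (this is the "yield") */
resume_1:
        ;                          /* re-entry continues right here */
    }

    st->resume = -1;               /* fell off the end of the function body */
    return false;
}
```

A caller would initialise the struct as { 0, 0, limit } and call
count_next() in a loop until it returns false. Only the generator function
itself is compiled this way, so the rest of the program does not need to
live in the same C function.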