On Sun, 09 Apr 2017 19:05:35 +1000, Chris Angelico wrote:
> When people talk about making a restricted optimizable subset of Python,
> the implication (if not the explicit declaration) is that it's done
> strictly by removing, not by rewriting.

Maybe *some* people have a naive, if not foolish, idea that all you need
to do to speed up CPython is to go through the C code and hit delete over
any functions you don't want. But that's hardly a technique that is going
to work: not only will you break features, but merely removing code
doesn't magically make the rest of the code base faster. At the very
least, you have to re-write the code that depends on the features you
have just removed.

If we're going to talk about speeding up Python, we ought to talk about
*serious* approaches to the problem, not the musings of random, ignorant
bloggers and other naive commentators. So let's look at what the experts
have done:

PyPy: a complete re-write, with a completely different approach (JIT
compilation).

WPython (alas, now more or less abandoned): a complete re-write of the
virtual machine, using 16-bit "wordcodes" instead of 8-bit "bytecodes";
this enabled a lot of good optimizations:

http://repository.root-me.org/Programmation/Python/EN%20-%20Beyond%20python%20bytecode.pdf
https://storage.googleapis.com/google-code-archive-downloads/v2/code.google.com/wpython2/Cleanup%20and%20new%20optimizations%20in%20WPython%201.1.pdf

Unladen Swallow: among other changes, added new, specialist byte-codes
to speed up common operations.

http://qinsb.blogspot.com.au/2011/03/unladen-swallow-retrospective.html

Nuitka: a static optimizing compiler, and a complete re-write. Nuitka
implements a number of compile-time optimizations while still keeping
(claimed) 100% compatibility with CPython.

http://nuitka.net/doc/user-manual.html

FATPython: adds compile-time name lookups, specialised functions,
dead-code elimination, constant folding and propagation to a CPython
base.

https://haypo.github.io/fat-python-status-janv12-2016.html

Cython: a *superset*, not subset, of Python.

I could continue, but you get the point. Some of these are experimental,
like FATPython; some were production-ready but abandoned because the
community had little interest (WPython); some are actively maintained
and used in production by many people (Cython, PyPy).

The Python ecosystem is actually quite healthy: if you need to speed up
code, there are lots of solutions, and some of them are even good
solutions. Nevertheless, it is interesting to discuss whether or not any
of these features will go mainstream or make it into CPython.

But anyway... it's not sufficient to just delete features to speed up a
language. You have to change the implementation, and very likely add
entirely new internal APIs for the compiler.

Creating a restricted subset of Python, and giving up compatibility with
ordinary Python, is certainly an option. That's what Facebook did with
their version of PHP. But it's not the only approach.

> A couple more quotes from the article:
>
>> It should be possible to define a subset of the Python language,
>> uninspiredly dubbed “TurboPython”, that excludes those features that
>> stand in the way of high-performance JIT execution (or compilation).
>> Not using some of these features coincides with good design practices,
>> so it doesn’t necessarily have to be all bad.
> ...
>> Since TurboPython is a subset of Python, it will also run on Python
>> interpreters, albeit slower.
>
> Both of these imply that the standard library of TurboPython is
> *exactly* the standard library of CPython,

No.

It doesn't say anything about the standard library. Maybe TurboPython
has no standard library at all. The library is independent of the
interpreter, and until somebody actually puts in the effort of writing
it (or copying it from one repo to another), we are equally justified
in saying any of:

- TurboPython's std lib will have 100% functional compatibility with
  Python, only the implementation may change;

- TurboPython has no std lib, it only has the built-in types and
  modules; if you want more, you can port it yourself;

- TurboPython's std lib is a subset of the Python std lib, just as
  TurboPython itself is a subset of Python;

- TurboPython's std lib is a MILLION TIMES BETTER than Python's.

Since TurboPython doesn't exist, any of those statements are equally
true.

But in a *practical* sense, for our mythical/hypothetical TurboPython to
have a chance at going mainstream, it needs a record/struct type, which
in Python is namedtuple. That does not necessarily mean it needs to
exec() arbitrary code. It just needs a way to create classes
dynamically. Even Java can do that.

http://stackoverflow.com/questions/2320404/
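For instance, here is a minimal sketch of the idea (mine, not the
article's, and not how the stdlib actually implements namedtuple): a
record class built dynamically with an ordinary call to type(), no
exec() required. make_record and Point are names invented for the
example.

    def make_record(typename, fieldnames):
        # Build a simple record class with the given fields. type() is
        # the dynamic equivalent of a class statement.
        fieldnames = tuple(fieldnames)

        def __init__(self, *args):
            if len(args) != len(fieldnames):
                raise TypeError('expected %d arguments' % len(fieldnames))
            for name, value in zip(fieldnames, args):
                setattr(self, name, value)

        def __repr__(self):
            return '%s(%s)' % (typename, ', '.join(
                '%s=%r' % (name, getattr(self, name))
                for name in fieldnames))

        return type(typename, (object,), {
            '__slots__': fieldnames,
            '__init__': __init__,
            '__repr__': __repr__,
        })

    Point = make_record('Point', ['x', 'y'])
    p = Point(3, 4)
    print(p)    # prints: Point(x=3, y=4)

Nothing here needs eval() or exec(), so nothing here obviously stands in
the way of static analysis.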
> minus the bits that aren't supported. We just cut a few of those nasty
> dynamic features out of the language, and voila! it becomes faster.

That's exactly what I'm talking about. Anyone who says "voila! it
becomes faster" and means it is too foolish to live :-)

Cutting out the dynamic features is just the first step in allowing you
to re-write the interpreter with more optimizations. A non-optimizing
compiler doesn't magically become optimizing just because you've deleted
some code from the interpreter.

> (Or in this case, JITable.) There's nothing suggested here about
> reimplementing existing features in a different way,

Except for common bloody sense :-P

[...]

> But again, you certainly COULD reimplement literal_eval, but then you
> have to keep your implementation accurate and up-to-date, else you risk
> bugs creeping in. It's a non-trivial task to rewrite these kinds of
> things and maintain parallel versions;

Sure, it's not thirty seconds' work, but it's not *hard* to write a
parser that accepts a small subset of Python expressions (those
involving only literals and a handful of built-in displays).

Are you aware that literal_eval is *not* the same code that the Python
interpreter uses for evaluating literals? They are kept in sync
manually, or not kept in sync, as the case may be. Python 2.7's
literal_eval didn't support the set display syntax even though the
interpreter did:

>>> from ast import literal_eval
>>> {1, 2}
set([1, 2])
>>> literal_eval("{1, 2}")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/ast.py", line 80, in literal_eval
    return _convert(node_or_string)
  File "/usr/lib/python2.7/ast.py", line 79, in _convert
    raise ValueError('malformed string')
ValueError: malformed string

literal_eval also had a bug in its handling of the + and - operators,
which it supports because they are needed for complex numbers:

http://stackoverflow.com/questions/40584417/why-does-ast-literal-eval5-7

The point being, while it's not *ideal* that support for evaluating
Python literals needs to be written twice (once in the real interpreter,
once in literal_eval), it's not unusual or uncommon or a particularly
onerous requirement.

> by calling on compile(), the current implementation *for free* is kept
> up-to-date with changes in CPython's grammar.

That's incorrect. literal_eval() needs a whitelist of what's allowed,
and a blacklist of what is allowed by the whitelist but shouldn't be.
There may be other work needed as well. See the source code :-)
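In outline (my sketch, not the actual stdlib code), the whitelist
approach looks something like this: parse the source into an AST, then
convert only the node types that are explicitly allowed, rejecting
everything else. tiny_literal_eval is an invented name.

    import ast

    def tiny_literal_eval(source):
        # Evaluate a tiny subset of Python expressions: numbers,
        # strings, tuples and lists. Anything not whitelisted below
        # is rejected.
        def convert(node):
            if isinstance(node, ast.Num):        # whitelisted: numbers
                return node.n
            elif isinstance(node, ast.Str):      # whitelisted: strings
                return node.s
            elif isinstance(node, ast.Tuple):    # tuple displays
                return tuple(convert(item) for item in node.elts)
            elif isinstance(node, ast.List):     # list displays
                return [convert(item) for item in node.elts]
            # Everything else (names, calls, operators) is refused.
            raise ValueError('malformed string')

        return convert(ast.parse(source, mode='eval').body)

    print(tiny_literal_eval("(1, 'two', [3.0])"))  # (1, 'two', [3.0])
    tiny_literal_eval("__import__('os')")          # raises ValueError

Note that the grammar knowledge is duplicated in the convert function:
if the interpreter grows a new display syntax, this evaluator doesn't
learn it for free.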
[...]

> Yes. As long as it can know when the more dynamic features are being
> used - and that's the hard part.

Hard, but not impossible, and not that hard. Look at Nuitka and
FATPython.

The thing that people forget is that not all errors are equal. There are
two ways that the optimizing compiler can do the wrong thing:

- fail to apply an optimization when it is safe to do so;

- apply an optimization when it is unsafe to do so.

Those errors are not equivalent. In the first case, all that happens is
that you lose an opportunity to be a little bit faster, but the executed
code is still correct. (Or at least no less correct than what you
wrote.) If you even notice, which you might not, it might be tricky to
work out why the compiler isn't applying the optimization, but in
principle it can be done and then worked around, or the compiler made a
bit smarter.

For example, it might be that the compiler applies an absolutely
dead-simple rule: *any* use of eval or exec disables *all*
optimizations. Don't like it? Then don't use eval or exec. If that's not
an option for you, patches are welcome.

The second case is much worse. If you're lucky, the "optimized" code
will merely crash, and you'll know it is broken. But worse, it may be
subtly incorrect, and do the wrong thing. Calculate a *very slightly
off* result. Or transfer funds to the wrong account, delete the wrong
files, turn off the wrong life-support machine, that sort of thing.

The second case must be avoided at all costs. (Or at least, unless the
programmer consents to unsafe optimizations that may change behaviour,
such as many unsafe floating-point optimizations.) The first case merely
indicates room for improvement. Patches are welcome.

> Remember, as soon as you have a single class implemented in Python, it
> could have a method injected into it without your knowledge. Can you
> detect that statically, or can you use the versioning of __dict__ to
> notice that something's been broken? What makes you fall back to the
> default implementation?

Ask Victor Stinner. You ask these rhetorical questions as if they're
major impediments, but they actually are more-or-less solved problems.
The tricky part is just applying the solutions to *Python* itself. It's
one thing to have a general strategy for optimizations; it's another to
apply it to a real language, and especially a real implementation.

But effectively, yes, I believe that dict versioning can be used to do
this. Suppose you have code that says:

    result = myinstance.method(99)

So you compile byte-code something like this pseudo-code:

    lookup myinstance
    if type(myinstance) is the expected type and neither the type nor
            instance __dict__ have changed:
        push 99 on the stack
        JUMP TO CODE AT 123456  # original location of method's code
        return
    else:
        get myinstance
        lookup method in myinstance.__dict__
        if not found:
            lookup method in each of type(myinstance).__mro__ or fail
        wrap the found object in MethodType
        build a set of positional and keyword arguments
        pass those positional and keyword arguments to the method
            object, which unpacks them and assigns to parameters
        location = look up the method's code object
        set up the book-keeping needed
        JUMP TO CODE AT location  # probably 123456 as above
        tidy up book-keeping
        return

The first case avoids a HUGE amount of work, at the cost of checking a
couple of guards ahead of time.
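To make the shape of that concrete, here is a toy Python-level
simulation (mine, not Victor Stinner's actual implementation). CPython's
dict version tag (PEP 509) is only visible at the C level, so a
hand-maintained counter stands in for it; InlineCache, Widget and
_dict_version are all invented names.

    class InlineCache(object):
        # Inline cache for obj.method() calls, guarded by the object's
        # type and a version tag standing in for the real dict version.
        def __init__(self):
            self.guard_type = None
            self.guard_version = None
            self.target = None

        def call(self, obj, name, *args):
            cls = type(obj)
            version = getattr(cls, '_dict_version', 0)
            # Fast path: two cheap guard checks, then call the cached
            # function directly, skipping the lookup machinery.
            if cls is self.guard_type and version == self.guard_version:
                return self.target(obj, *args)
            # Slow path: full lookup through the MRO, then refill
            # the cache.
            for klass in cls.__mro__:
                if name in klass.__dict__:
                    self.guard_type = cls
                    self.guard_version = version
                    self.target = klass.__dict__[name]
                    return self.target(obj, *args)
            raise AttributeError(name)

    class Widget(object):
        _dict_version = 0        # bumped whenever the class changes
        def method(self, x):
            return x * 2

    cache = InlineCache()
    w = Widget()
    print(cache.call(w, 'method', 99))  # slow path: look up, fill cache
    print(cache.call(w, 'method', 99))  # fast path: guards pass

    Widget.method = lambda self, x: x * 3  # inject a new method...
    Widget._dict_version += 1              # ...the version tag changes
    print(cache.call(w, 'method', 99))     # guard fails: slow path again

The injected method is picked up correctly, because the version guard
fails and forces the full lookup again.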
So the fast path becomes (say) 20% faster, and the slow path becomes 2%
slower. Even if you take the slow path nine times as often as the fast
path, it's still an overall win: 0.9*1.02 + 0.1*0.80 = 0.998 of the
original time.

But you don't get that just by selecting the C code for eval() and
exec() in your editor and pressing Delete.

-- 
Steve

-- 
https://mail.python.org/mailman/listinfo/python-list