On Sun, Apr 9, 2017 at 5:39 PM, Steven D'Aprano <st...@pearwood.info> wrote:
> On Sun, 09 Apr 2017 13:57:28 +1000, Chris Angelico wrote:
>> From that page:
>>
>>> Other candidates for banishment from TurboPython include eval and exec.
>>
>> Bye bye namedtuple.
>
> All that would mean is that the implementation of namedtuple would have
> to change. It would probably require some sort of support from the
> compiler, but that's okay, the compiler can do (nearly) anything.
>
> Something as important and useful as namedtuple would not be dropped
> from this hypothetical TurboPython. It would just shift the
> implementation from pure-Python to something else.
>
> exec() is actually only needed for a *tiny* bit of namedtuple. The
> original implementation by Raymond Hettinger takes the easy way out by
> using exec on the entire class definition, but it is trivially easy to
> restrict that to just the class __new__ method:
>
> https://code.activestate.com/recipes/578918-yet-another-namedtuple
>
> but even that could be avoided with a minimal amount of help from the
> compiler.
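To make that concrete, here's a minimal sketch of the approach (my own toy code, not the recipe's): the class itself is assembled with ordinary type() machinery, and exec() generates only the __new__ method, which is the one piece that needs a real argument signature.

```python
from operator import itemgetter

def mini_namedtuple(typename, field_names):
    """Toy namedtuple factory: exec() is confined to generating __new__."""
    fields = tuple(field_names.split())
    args = ", ".join(fields)
    # The only generated source: a __new__ with a real signature.
    source = (
        "def __new__(cls, {0}):\n"
        "    return tuple.__new__(cls, ({0},))\n"
    ).format(args)
    namespace = {}
    exec(source, namespace)
    attrs = {"__new__": namespace["__new__"], "_fields": fields}
    # Everything else is plain attribute plumbing - no generated code.
    for index, name in enumerate(fields):
        attrs[name] = property(itemgetter(index))
    return type(typename, (tuple,), attrs)

Point = mini_namedtuple("Point", "x y")
p = Point(3, 4)
print(p.x, p.y)  # -> 3 4
```

With a little compiler support for building a function from a name and a field list, even that residual exec() could go away - which is Steven's point.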
When people talk about making a restricted optimizable subset of
Python, the implication (if not the explicit declaration) is that it's
done strictly by removing, not by rewriting. A couple more quotes from
the article:

> It should be possible to define a subset of the Python language,
> uninspiredly dubbed “TurboPython”, that excludes those features
> that stand in the way of high-performance JIT execution (or
> compilation). Not using some of these features coincides with
> good design practices, so it doesn’t necessarily have to be all bad.

...

> Since TurboPython is a subset of Python, it will also run on Python
> interpreters, albeit slower.

Both of these imply that the standard library of TurboPython is
*exactly* the standard library of CPython, minus the bits that aren't
supported. We just cut a few of those nasty dynamic features out of the
language, and voila! it becomes faster. (Or in this case, JITable.)
There's nothing suggested here about reimplementing existing features
in a different way, with the consequent possibility of having slightly
different behaviour - a Python script is guaranteed to have the same
semantics on TurboPython as on CPython. This is the kind of description
used by asm.js (http://asmjs.org), defined as a strict subset of
JavaScript that can be implemented very efficiently, but with an
absolute guarantee that the semantics will be identical.

>> And compile() is going to have to go,
>
> Indeed.
>
>> since you could implement eval/exec by creating a new function:
>>
>>>>> runme = r"""
>>     print("Hello, world!")
>>     import sys
>>     sys.stdout.write("I can import modules.\n")
>> """
>>>>> type(lambda: 1)(compile("def f():" + runme, "exec",
>> ... "exec").co_consts[0], globals())()
>> Hello, world!
>> I can import modules.
>>
>> So if compile() goes, you also lose ast.parse,
>
> Probably.
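For anyone who wants to try that trick without fighting the quoting, here's a tidied, standalone version (the fake_exec name and helper structure are mine): an exec() substitute built purely from compile() and the function type, which is why banning exec alone achieves nothing.

```python
from types import CodeType, FunctionType

def fake_exec(source, glob):
    """Run arbitrary source with only compile() - no eval, no exec."""
    # Indent the payload so it compiles as the body of a dummy function.
    body = "".join("    " + line + "\n" for line in source.splitlines())
    module_code = compile("def f():\n" + body, "<fake_exec>", "exec")
    # Fish f's code object out of the module's constants.
    func_code = next(c for c in module_code.co_consts
                     if isinstance(c, CodeType))
    FunctionType(func_code, glob)()

fake_exec('print("Hello, world!")\n'
          'import sys\n'
          'sys.stdout.write("I can import modules.\\n")',
          globals())
# prints: Hello, world! / I can import modules.
```

One behavioural difference worth noting: assignments in the payload become locals of the throwaway function rather than entries in glob, so this is a sketch of the escape hatch, not a drop-in exec() replacement.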
The only way to *not* lose ast.parse() is to completely reimplement it
or compile(), which is not a good idea IMO (unless you can somehow dump
a pure-Python compile() out of an existing source), and which is not at
all implied to be part of the proposal.

>> which means you lose introspection tools,
>
> *Some* introspection tools. Python has many that don't rely on compile
> or ast.parse.

Yes, that's what I meant - that there will be introspection tools that
you lose. Sorry for the unclear language. Obviously there's a lot of
introspection that's based on pre-existing features, like the
attributes on functions and code objects.

>> plus you lose literal_eval and friends.
>
> I don't know what "friends" you are referring to, but again, if
> literal_eval is important, the compiler can support it. If you can
> support an entire Python interpreter and compiler in the form of
> compile(), then you can support a more restricted subset of the
> language. Writing a parser to evaluate strings, integers, and a few
> other data types is not exactly brain surgery.

Other tools built on top of literal_eval. Though I had the feeling
there were more of them than a 'git grep' has shown up - it's actually
only used in a couple of places in the stdlib. My bad.

But again, you certainly COULD reimplement literal_eval, but then you
have to keep your implementation accurate and up-to-date, else you risk
bugs creeping in. It's a non-trivial task to rewrite these kinds of
things and maintain parallel versions; by calling on compile(), the
current implementation is kept up-to-date *for free* with changes in
CPython's grammar. It wouldn't have needed any changes when the u"..."
prefix was re-added in Python 3.3, for instance, because the AST didn't
change.

>> I'm also not sure whether the import machinery would have to be
>> rewritten, but a quick 'git grep' suggests that it would. Removing
>> eval and exec is not as simple as removing them from the standard
>> library.
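Incidentally, literal_eval's reliance on the compiler is easy to see from the outside - it accepts exactly the literal syntax compile() understands, and raises ValueError for anything that would need actual execution:

```python
import ast

# literal_eval handles arbitrary nesting of Python's literal syntax...
value = ast.literal_eval("{'nums': [1, 2, 3], 'pt': (4.5, 6)}")
print(value["pt"])  # -> (4.5, 6)

# ...but refuses anything requiring evaluation, however harmless-looking.
try:
    ast.literal_eval("__import__('os')")
except ValueError as err:
    print("rejected:", err)
```

Under the hood it's a thin validation layer over ast.parse(), which is exactly why it tracks grammar changes automatically.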
> Well of course not, but removing eval and exec is only a necessary,
> not sufficient, condition for enabling a huge bunch of compiler
> optimizations and speeding up Python.
>
> This "TurboPython" would require a completely new implementation of
> the Python interpreter to be fast. It's not as if eval and exec are
> great heavy concrete weights chained to the leg of the compiler, and
> all you need do is remove the chain and the compiler suddenly becomes
> thirty times faster.

Right. But my point is that it _also_ has to be a completely new
implementation of large slabs of the standard library, too. At some
point, it's not "Python minus the bits we can't JIT", but "a complete
implementation of a Python-like language with a restricted stdlib".
RPython has already been mentioned.

>> In fact, extreme dynamism is baked deep into the language. You'd have
>> to make some fairly sweeping language changes to get any real
>> benefits from restricting things.
>
> Well, maybe. As has been pointed out many, many times, 99% of Python
> code avoids the sorts of extreme dynamism that keep things slow. Lots
> of people would be satisfied with a language *really close* to Python
> that was ten or twenty times faster, even if it meant that you
> couldn't write code like this:
>
> answer = input("What's your name?")
> exec("name = %r" % answer)
> print(name)

Sure. But would they also be happy that dunder methods for operators
have different behaviour? I rather suspect that they'll be a target
early on. And mandatory type hints don't help, because a type hint
guarantees that something is a valid subclass, while optimization
depends on the exact type.
rosuav@sikorsky:~/tmp$ cat demo.py
def add_two_integers(x: int, y: int):
    tot = x + y
    print("I am adding", x, "and", y, "to get", tot)
    return tot

# Simple example
add_two_integers(5, 7)

# Proof that non-integers are rejected
# add_two_integers(5.0, 7.0)  # Uncomment to get an error from MyPy

# Evil example
class Int(int):
    def __add__(self, other):
        return int(self) + other - 1

add_two_integers(Int(5), 7)
rosuav@sikorsky:~/tmp$ mypy demo.py
rosuav@sikorsky:~/tmp$ python3 demo.py
I am adding 5 and 7 to get 12
I am adding 5 and 7 to get 11

The only way to optimize this function is to somehow know that you're
working with actual integers - which probably means you're doing work
that can be handed off to numpy.

> Even better would be if the compiler was smart enough to use the
> optimized, fast runtime when the dynamic features aren't used, and
> fall back on a slower implementation only when needed to support the
> more dynamic features.

Yes. As long as it can know when the more dynamic features are being
used - and that's the hard part. Remember, as soon as you have a single
class implemented in Python, it could have a method injected into it
without your knowledge. Can you detect that statically, or can you use
the versioning of __dict__ to notice that something's been broken? What
makes you fall back to the default implementation?

The problem is the same thing that gives Python a lot of its beauty:
that there's very little difference between built-in types and
user-defined types. In CPython, types implemented in C are immutable,
but other than that, they're basically the same thing as anything you
make, and you can inspect them and stuff:

>>> int.__add__(5, 7)
12

In contrast, JavaScript has a fundamental difference between "stuff
implemented in JavaScript" and "stuff the interpreter gives you".
For example, you can create a constructor function and set its
.prototype attribute to some object, which is broadly like subclassing
that object; but you can't do that with a Number (the JS floating-point
type). You don't get a subclass of Number whose behaviour you can then
tweak; you get a completely different thing, one that doesn't behave
like a Number at all. I'm sure a JS expert could tell me how to make an
object that behaves like a Number, but it isn't as simple as Python's
way:

class MyInt(int): ...

Pike, too, manages to outperform Python by a notable factor (usually
around 3:1 or better) - and its native integer type is also distinctly
different from its object types. You can't say:

class MyInt {
    inherit int;
}

You can inherit from the equivalent object form Gmp.mpz, for those
situations where you want an int-like object, but it's not going to
perform as well as the native int does.

Personally, I think that would be a far better avenue to go down. I'm
not sure it would be possible to retrofit this to Python (for example,
there are guarantees about object identity that apply to *all*
objects), but making the core immutable types into value-only
non-objects wouldn't break a lot of code, and might provide some
significant performance improvements to string manipulation and
arithmetic operations. But at some point, it's not really Python any
more.

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list