On Sun, 01 Apr 2018 14:24:38 -0700, David Foster wrote:

> My understanding is that the Python interpreter already has enough
> information when bytecode-compiling a .py file to determine which names
> correspond to local variables in functions. That suggests it has enough
> information to identify all valid names in a .py file and in particular
> to identify which names are not valid.

Not even close.

The bottom line is, the Python core developers don't want to spend
their time writing and maintaining what is effectively a linter. Python
is run by a small team of volunteers with relatively little funding, and
there are far more important things for them to work on than duplicating
the work done by linters. If you want something to check your code ahead
of time for undefined names, then run a linter: you have many to choose
from.

But even if they were prepared to do so, it isn't as easy or cheap as
you think. This sort of analysis works for local variables because
Python has decided on the rule that *any* binding operation to a local
name in a function makes it a local, regardless of whether that binding
operation would actually be executed or not. So calling this function:

    def function():
        len
        return None
        if False:
            len = len

fails with UnboundLocalError. That's the rule for functions, and it is
deliberately made more restrictive than for Python code outside of
functions as a speed optimization.

(In Python 3, the rule is more restrictive than for Python 2: star
imports inside functions and unqualified exec are forbidden too.)

But it doesn't work for globals unless you make unjustifiable (for the
compiler) assumptions about what code contains, or do an extremely
expensive whole-application analysis.

For example, here's a simple, and common, Python statement:

    import math

Can you tell me what global names that line will add to your globals?

If you said only "math", then you're guilty of making those
unjustifiable assumptions. Of course, for *sensible* code, that will be
the only name added, but the compiler shouldn't assume the code is
sensible. Linters can, but the compiler shouldn't.

The imported module is not necessarily the standard library `math`
module; it could be a user-defined module shadowing it.
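To make that concrete, here is a minimal sketch of an import with
side-effects. Everything named in it is made up for illustration:
`sneaky_math` is a hypothetical module standing in for whatever ends up
shadowing `math`, and `surprise` is an arbitrary name. The module is
written to a temporary directory and imported from there; importing it
adds a name to builtins, so a lookup that raised NameError a moment
earlier now succeeds.

```python
import builtins
import importlib
import os
import sys
import tempfile

# Source of a hypothetical module that misbehaves at import time.
SNEAKY_SOURCE = (
    "import builtins\n"
    "builtins.surprise = 12345  # side-effect: a new built-in name\n"
)


def import_sneaky_module():
    """Write the hypothetical module to a temp dir and import it."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "sneaky_math.py"), "w") as f:
            f.write(SNEAKY_SOURCE)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            import sneaky_math as math  # looks like an ordinary import
        finally:
            sys.path.remove(tmp)
    return math


def lookup_surprise():
    # 'surprise' is never bound anywhere in this file.
    return surprise


def demo():
    """Return what the lookup did before and after the import."""
    try:
        try:
            lookup_surprise()
            before = "defined"
        except NameError:
            before = "NameError"
        import_sneaky_module()
        after = lookup_surprise()
        return before, after
    finally:
        # Undo the side-effects so the demo can be run again.
        vars(builtins).pop("surprise", None)
        sys.modules.pop("sneaky_math", None)


if __name__ == "__main__":
    print(demo())
```

Calling demo() should return the pair ("NameError", 12345), and nothing
in the source of lookup_surprise() tells the compiler that the name will
exist by the time it runs.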
That module could have side-effects, and those side-effects could
include populating the current module (not the fake `math`) with any
number of globals, or adding/deleting names from the builtins.

So the instant you import a module, in principle you no longer know the
state of globals.

Of course, in practice we don't do that. Much. But it is intentionally
allowed, and it is not appropriate for the compiler to assume that we
never do that. A linter can assume sensible code, and get away with more
false negatives than the compiler can.

So here is a partial list of things which could change the global or
built-in name spaces, aside from explicit binding operations:

- star imports;
- importing any module could inject names into builtins or your globals
  as a side-effect;
- calling any function could do the same;
- exec;
- eval, since it could call exec;
- manipulating globals() or locals();
- even under another name, e.g.:

      foo = False or globals
      # later
      foo()['surprise'] = 12345

I've probably missed many.

Of course sensible code doesn't do horrible things like those (possibly
excluding the star imports), but the compiler would have to cope with
them since they are allowed and sometimes they're useful.

Unlike a linter, which can afford to be wrong sometimes, the compiler
cannot be wrong or it counts as a compiler bug. Nobody will be too upset
if a linter misses some obscure case in obfuscated weird code. But if
the compiler wrongly flags an error when the code is actually legal,
people will be justifiably annoyed.

-- 
Steve