On Tue, Oct 26, 2021 at 04:48:17AM +1100, Chris Angelico wrote:
> The problem is the bizarre inconsistencies that can come up, which are
> difficult to explain unless you know exactly how everything is
> implemented internally. What exactly is the difference between these,
> and why should some be legal and others not?
They should all be legal. Legal doesn't mean "works". Code that raises
an exception is still legal code.
> def f1(x=>y + 1, y=2): ...
> def f2(x=>y + 1, y=>2): ...
> def f3(x=>y + 1, *, y): ...
> def f4(x=>y + 1): y = 2
> def f5(x=>y + 1):
> global y
> y = 2
What "bizarre inconsistencies" do you think they have? Each example is
different so it is hardly shocking if they behave different too.
f1() assigns positional arguments first (there are none), then
keyword arguments (still none), then early-bound defaults left to
right (y=2), then late-bound defaults left to right (x=y+1).
That is, I argue, the most useful behaviour. But if you insist on a
strict left-to-right single pass to assign defaults, then instead it
will raise UnboundLocalError because y doesn't have a value.
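To make that concrete, here is a rough simulation of the two-pass model
in today's Python, using a sentinel (the _MISSING object is my own
stand-in for "no argument passed", not part of the PEP):

    _MISSING = object()  # stand-in for "no argument passed"

    def f1(x=_MISSING, y=2):
        # pass 1: early-bound defaults are already in place (y=2)
        # pass 2: late-bound defaults are evaluated left to right
        if x is _MISSING:
            x = y + 1  # y is bound by now, so f1() gives x=3, y=2
        return x, y

    # Under a strict single pass, x=y+1 would be evaluated before
    # y=2 exists, and f1() would raise UnboundLocalError instead.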
Just like the next case: f2() assigns positional arguments first (there
are none), then keyword arguments (still none), then early-bound
defaults left to right (none of these either), then late-bound defaults
left to right (x=y+1) which raises UnboundLocalError because y is a
local but doesn't have a value yet.
f3() assigns positional arguments first (there are none), then
keyword arguments (still none), at which point it raises TypeError
because you have a mandatory keyword-only argument with no default.
f4() is just like f2(): the assignment to y in the body makes y a local
variable, so evaluating the default x=y+1 raises UnboundLocalError.
And lastly, f5() assigns positional arguments first (there are none),
then keyword arguments (still none), then early-bound defaults left to
right (none of these either), then late-bound defaults left to right
(x=y+1), which raises NameError if global y doesn't exist, and otherwise
succeeds (the global declaration makes y refer to the global namespace).
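For comparison, here is roughly how f5 would be spelled with the
sentinel trick today; it behaves the same way (a sketch, with _MISSING
again standing in for "no argument passed"):

    _MISSING = object()

    def f5(x=_MISSING):
        global y
        if x is _MISSING:
            x = y + 1  # NameError today too, if global y doesn't exist
        y = 2
        return x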
Each of those cases is easily understandable. There is no reason to
expect the behaviour in all of these cases to be the same, so we can
hardly complain that they are "inconsistent", let alone "bizarrely
inconsistent".
The only novelty here is that functions with late-binding can raise
arbitrary exceptions, including UnboundLocalError, before the body of
the function is entered. If you don't like that, then you don't like
late-bound defaults at all and you should be arguing in favour of
rejecting the PEP :-(
If we consider code that already exists today, with the None sentinel
trick, each of those cases has an equivalent error today, even if some
of the fine detail differs (e.g. getting TypeError because we attempt to
add 1 to None instead of an unbound local).
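For instance, the sentinel spelling of f2 fails today as well, just with
a different exception (a sketch, using the common None-sentinel idiom):

    def f2(x=None, y=None):
        if x is None:
            x = y + 1  # TypeError today: adding 1 to None
        if y is None:
            y = 2
        return x, y

    f2()  # TypeError, where the late-bound version raises UnboundLocalError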
However there is a real, and necessary, difference in behaviour which I
think you missed:
def func(x=x, y=>x) # or func(x=x, @y=x)
The x=x parameter uses global x as the default. The y=x parameter uses
the local x as the default. We can live with that difference. We *need*
that difference in behaviour, otherwise these examples won't work:
def method(self, x=>self.attr) # @x=self.attr
def bisect(a, x, lo=0, hi=>len(a)) # @hi=len(a)
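Today the only way to get that behaviour is to push the default into the
body, which is roughly what the stdlib bisect functions already do:

    def bisect(a, x, lo=0, hi=None):
        # today's spelling: the default is computed inside the body,
        # in the function's namespace, so it sees the local a
        if hi is None:
            hi = len(a)
        while lo < hi:
            mid = (lo + hi) // 2
            if x < a[mid]:
                hi = mid
            else:
                lo = mid + 1
        return lo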
Without that difference in behaviour, probably fifty to eighty percent
of the use-cases are lost. (And the ones that remain are mostly trivial
ones of the form arg=[].) So we need this genuine inconsistency.
If you can live with that actual inconsistency, why are you losing sleep
over behaviour (functions f1 through f4) which isn't actually inconsistent?
* Code that does different things is supposed to behave differently;
* The differences in behaviour are easy to understand;
* You can't prevent the late-bound defaults from raising
UnboundLocalError, so why are you trying to turn a tiny subset
of such errors into SyntaxError?
* The genuine inconsistency is *necessary*: late-bound expressions
should be evaluated in the function's namespace, not the surrounding
(global) namespace.
> And importantly, do Python core devs agree with less-skilled Python
> programmers on the intuitions?
We should write a list of the things that Python wouldn't have if the
intuitions of "less-skilled Python programmers" were a necessary
condition.
- no metaclasses, descriptors or decorators;
- no classes, inheritance (multiple or single);
- no slices or zero-based indexing;
- no mutable objects;
- no immutable objects;
- no floats or Unicode strings;
etc. I think that, *maybe*, we could have `print("Hello world")`, so
long as the programmer's intuition is that print needs parentheses.
> If this should be permitted, there are two plausible semantic meanings
> for these kinds of constructs:
>
> 1) Arguments are defined left-to-right, each one independently of each other
> 2) Early-bound arguments and those given values are defined first,
> then late-bound arguments
>
> The first option is much easier to explain, but will never give useful
> results for out-of-order references (unless it's allowed to refer to
> the containing scope or something). The second is closer to the "if x
> is None: x = y + 1" equivalent, but is harder to explain.
You just explained it perfectly in one sentence.
The two options are equally easy to explain. The second takes a few more
words, but the concepts are no harder. And the second is much more
useful.
In comparison, think about how hard it is to explain your preferred
behaviour, a SyntaxError. Think about how many posts you have written
and how many examples you have given, hundreds, maybe thousands, of
words, dozens or hundreds of sentences, and you have still not convinced
everyone that "raise SyntaxError" is the right thing to do.
"Why does this simple function definition raise SyntaxError?" is MUCH
harder to explain than "Why does a default value that tries to access an
unbound local variable raise UnboundLocalError?".
> Two-phase initialization is my second-best preference after rejecting
> with SyntaxError, but I would love to see some real-world usage before
> opening it up. Once permission is granted, it cannot be revoked, and
> it might turn out that one of the other behaviours would have made
> more sense.
Being cautious about new syntax is often worthwhile, but here you are
being overcautious. You are trying to prohibit something as a syntax
error because it *might* fail at runtime. We don't even protect against
things that we know *will* fail!
x = 1 + 'a' # Not a syntax error.
In this case, the two-pass model is clearly superior because it would
allow everything that the one-pass behaviour allows, *plus more*
applications that we haven't even thought of yet (but others will).
Analogy:
When Python 1 was first evolving, nobody said that we ought to be
cautious about parallel assignment:
a, b, c = ...
just because the user might misuse it.
    a = 1
    if False:
        b = 1  # oops, I forgot to define b
    a, b = b, a  # SyntaxError just in case?
Nor did we lose sleep over which parallel assignment model is better,
and avoid making a decision:
a, b = b, a
# Model 1:
push b
push a
swap
a = pop stack
b = pop stack
versus:
# Model 2:
push b
a = pop stack
push a
b = pop stack
The two models are identical if the expressions on the right are all
distinct from the targets on the left, e.g. `a, b = x, y`, but the first
model allows us to do useful things that the second doesn't, such as the
"swap two variables" idiom.
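You can check the difference for yourself; model 2 is what you get if
you naively serialise the assignments:

    a, b = 1, 2
    a, b = b, a  # model 1: both right-hand values are evaluated first
    assert (a, b) == (2, 1)

    a, b = 1, 2
    a = b  # model 2, serialised: the old value of a is lost...
    b = a  # ...so both names end up bound to 2
    assert (a, b) == (2, 2)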
Be bold! The "two pass" model is clearly better than the "one pass"
model. You don't need to prevaricate just in case.
Worst case, the Steering Council will say "Chris we love everything
about the PEP except this..." and you will have to change it. But they
won't because the two pass model is clearly the best *wink*
--
Steve