[Python-ideas] Re: Make `del x` an expression evaluating to `x`

Andrew Barnert via Python-ideas Thu, 12 Mar 2020 17:30:29 -0700

On Mar 12, 2020, at 13:22, Marco Sulla <[email protected]> 
wrote:
> 
> On Thu, 12 Mar 2020 at 18:42, Andrew Barnert via Python-ideas
> <[email protected]> wrote:
>> What if a for loop, instead of nexting the iterator and binding the result 
>> to the loop variable, unbound the loop variable, nexted the iterator, and 
>> bound the result to the loop variable?
> 
> I missed that. But I do not understand how this can speed up any loop.
> I mean, if Python does this, it performs an additional operation at every
> loop cycle, the unbinding. How can it be faster?


Because rebinding includes unbinding if it was already bound, so the unbinding 
happens either way.

Basically, instead of this pseudocode:

    push it->tp_iternext(it) on the stack
    if f_locals[idx]:
        decref f_locals[idx]
        f_locals[idx] = NULL
    f_locals[idx] = stack pop
    incref f_locals[idx]

… you’d do this:

    if f_locals[idx]:
        decref f_locals[idx]
        f_locals[idx] = NULL
    push it->tp_iternext(it) on the stack
    f_locals[idx] = stack pop
    incref f_locals[idx]

No extra cost (or, if you don’t optimize it out, the only extra cost is 
checking whether the variable is already bound an extra time, which is just 
checking a pointer in an array against NULL), and the benefit is that the 
object is decref’d before you call tp_iternext.

Why does this matter? Well, that’s the whole point of the proposal.

A decref may reduce the count to 0. In this case, the object is freed before 
tp_iternext is called, so if tp_iternext needed to do a big allocation for each 
value, the object allocator will probably reuse the last one instead of going 
back to the heap.
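
To make the ordering difference concrete, here is a minimal sketch in pure Python rather than C. The names Payload, Producer, plain_loop and unbind_first_loop are made up for illustration, `x = None` stands in for the unbinding step (a real `del x` would fail on the first pass), and the weakref results assume CPython's immediate refcount-based deallocation.

```python
import weakref


class Payload:
    """Hypothetical stand-in for a large per-iteration value."""


class Producer:
    """Iterator that records, on entry to __next__, whether the value it
    returned last time is still alive."""

    def __init__(self, count, log):
        self.remaining = count
        self.log = log
        self.prev_ref = None

    def __iter__(self):
        return self

    def __next__(self):
        if self.prev_ref is not None:
            self.log.append(self.prev_ref() is not None)
        if self.remaining == 0:
            raise StopIteration
        self.remaining -= 1
        value = Payload()
        # Weak reference only, so the iterator does not keep the value alive.
        self.prev_ref = weakref.ref(value)
        return value


def plain_loop(count):
    """Ordinary for loop: x is rebound only after __next__ returns."""
    log = []
    for x in Producer(count, log):
        pass
    return log


def unbind_first_loop(count):
    """Hand-written loop that drops the old value before calling next()."""
    log = []
    it = Producer(count, log)
    while True:
        x = None  # stand-in for unbinding the loop variable
        try:
            x = next(it)
        except StopIteration:
            break
    return log
```

On CPython, plain_loop(3) should give a log of all True (the old value is still alive inside every __next__ call) and unbind_first_loop(3) a log of all False. The last entry in each log comes from the call that raises StopIteration.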

A decref may also reduce the count to 1, if the iterator is storing a copy of 
the same object internally. In general this doesn’t help, but if the iterator 
is written in C and it knows the object is a special known-safe type like tuple 
(which is immutable and has no reference borrowing APIs) it can reuse it 
safely. As permutations apparently does.
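
A rough way to observe that reuse from Python, assuming CPython: compare how many distinct tuple identities show up when each result is dropped before the next one is requested versus when every result is kept. The helper names are made up for this sketch, and the "dropped" count is an implementation detail, so no particular number is promised for it.

```python
import itertools


def distinct_ids_when_dropped(iterable):
    """map() hands each tuple to id() and releases it before asking the
    iterator for the next one, so the iterator may see refcount 1."""
    return len(set(map(id, iterable)))


def distinct_ids_when_kept(iterable):
    """Keep every tuple alive in a list, so the iterator can never reuse one."""
    kept = list(iterable)
    return len({id(t) for t in kept})


dropped = distinct_ids_when_dropped(itertools.permutations(range(4)))
kept = distinct_ids_when_kept(itertools.permutations(range(4)))
print(dropped, kept)
```

The kept count has to be 24, since 24 live tuples cannot share an id. The dropped count is expected to be much smaller on CPython if the result tuple really is being recycled.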

All that being said, as Guido explained, I don’t think my idea is workable. I 
think what we really want is to release the object before tp_iternext iff it’s 
not going to raise StopIteration, and there’s no way to predict that in advance 
without solving the halting problem, so…

> Furthermore, maybe I am wrong, but doesn't assigning another object to
> a variable automatically unbind the variable from the previous
> object?
> As far as I know, Python is "pass by value", where the value is a
> pointer, like Java.

That’s misleading terminology. Java uses “pass by value” and Ruby uses “call by 
reference” to mean doing the same thing Python does, so describing it as either 
“by value” or “by reference” is just going to confuse as many people as it 
helps. Barbara Liskov explained why it was a meaningless distinction for 
languages that aren’t sufficiently ALGOL-like back around 1980, and I don’t 
know why people keep confusingly trying to force languages to fit anyway 40 
years later. Better to just describe what it does.

> Indeed any python variable is a PyObject*, a
> pointer to a PyObject.

No. Any Python _value_ is a PyObject*. It doesn’t matter whether the value is a 
temporary, stored in a variable, stored in a list element, stored in 17 
different variables, whatever.

And that’s all specific to the CPython implementation. In Jython or PyPy, a 
Python value is a Java object or a Python object in the underlying Python.

So what’s a variable? Well, Python doesn’t have variables in the same sense as 
a language like C++. It has namespaces, that map names to values. A variable in 
Python’s syntax is just a lookup of a name in a namespace in Python’s 
semantics. And a namespace is in general just a dictionary. That’s pretty much 
all there is to variables. (There’s an optimization for locals, which are 
converted into indexes into a C array of values stored on the frame object 
instead, which is why we have all those issues with locals() and exec. And 
there’s also the cell thing for closure variables. And there’s nothing stopping 
you from replacing a namespace’s __dict__ with an object of a different type 
that does almost anything you can imagine. But ignore all of that.) If you 
understand dicts, you understand variables, and you don’t need to mention 
PyObject* to understand dicts (unless you want to use them from the C API).
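
A small sketch of the "namespace is just a dict" view, using exec with an explicit dict as the global namespace. The helper name run_in_namespace is made up here; exec itself is the standard builtin.

```python
def run_in_namespace(source, namespace):
    """Execute source with the given dict as its global namespace."""
    exec(source, namespace)
    return namespace


ns = run_in_namespace("x = 10\ny = x + 1", {})
# Assignments in the code became plain dict entries.
x_value, y_value = ns["x"], ns["y"]

# Going the other way: writing to the dict rebinds the name.
ns["x"] = 99
run_in_namespace("z = x", ns)
z_value = ns["z"]
```

Here the assignment `x = 10` shows up as ns["x"], and setting ns["x"] = 99 is what the later lookup of `x` sees.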

> When you assign a new object to a variable, what you are really doing
> is changing the value of the variable from one pointer to another.

You’re just updating the namespace dict to map the same key to a different 
value.

> So the
> variable now points to a new memory location, and the old object has
> no more references other than itself. Am I right?

Well, the dict entry now holds a new value, and the old value has one reference 
fewer, which may be 0, in which case it’s garbage and can be cleaned up. It 
doesn’t hold a reference to itself (except in special cases, e.g., 
`self.circular = self` or `xs.append(xs)`).
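
A sketch of that behavior using weak references to watch an object without keeping it alive. Node and reference_demo are names invented for this example, and the "freed as soon as the last name goes away" result assumes CPython's reference counting; other implementations may free the object later.

```python
import gc
import weakref


class Node:
    """Hypothetical object we can attach attributes and weakrefs to."""


def reference_demo():
    results = {}

    a = Node()
    b = a                      # two names, one object
    ref = weakref.ref(a)
    a = None                   # one reference fewer, but b still holds it
    results["alive_with_one_name"] = ref() is not None
    b = None                   # last reference gone
    results["alive_with_no_names"] = ref() is not None

    c = Node()
    c.circular = c             # the object now references itself
    cycle_ref = weakref.ref(c)
    c = None                   # refcount does not reach 0 because of the cycle
    gc.collect()               # the cycle collector has to clean this one up
    results["cycle_alive_after_gc"] = cycle_ref() is not None

    return results
```

On CPython this should report the object alive while one name remains, gone once no names remain, and the self-referencing object gone only after the cycle collector has run.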

In CPython, where values are PyObject* under the covers, the hash buckets in a 
dict include PyObject* slots for the key and value, and the dict’s __setitem__ 
takes care of incref’ing the stored value, and incref’ing the key if it’s new 
or decref’ing the old value if it’s a replacement. And CPython knows to delete 
an object as soon as a decref brings it to 0 refs. (What about fastlocals? The 
code to load and store variables to fastlocal slots does the same incref and 
decref stuff, but there’s no keys to worry about, because the compiler already 
turned them into indexes into an array. And an unbound local variable is a NULL 
in the array, as opposed to just not being in the dict. And if you want to dig 
into cells, they’re not much more complicated.)
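
The difference between an unbound fast local and a name missing from a dict is visible from Python in the exception you get. This is only a sketch; the function names are invented for illustration.

```python
def read_after_del():
    x = 1
    del x                      # the slot for x is emptied, but x is still a local
    try:
        return x
    except UnboundLocalError:
        return "unbound local"


def read_missing_name(namespace):
    try:
        exec("y", namespace)   # look up y in a dict-based namespace
    except NameError:
        return "name not in dict"
    return "found"


# The compiler already decided which names are locals of the function.
local_names = read_after_del.__code__.co_varnames
```

read_after_del still knows that x is one of its locals (it appears in co_varnames) even though nothing is bound to it, while in the dict case the key simply is not there.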

But again, that’s all CPython specific. In, say, Jython, the hash buckets in a 
dict just have two Java objects for the key and value, which aren’t pointers 
(although under another set of covers your JVM is probably implemented in C 
using pointers all over the place), and nobody’s tracking refcounts; the JVM 
scans memory whenever it feels like it and deletes any objects (including 
Python ones) that aren’t referenced by anyone. This is why any optimizations 
like permutations reusing the same tuple if the refcount is 1 only make sense 
for CPython (and only from the C API rather than from Python itself).
