On Mon, 09 May 2011 12:52:27 +1200, Gregory Ewing wrote:

> Steven D'Aprano wrote:
>
>> Since you haven't explained what you think is happening, I can only
>> guess.
>
> Let me save you from guessing. I'm thinking of a piece of paper with a
> little box on it and the name 'a' written beside it. There is an arrow
> from that box to a bigger box.
>
>                          +-------------+
>    +---+                 |             |
>  a | --+---------------->|             |
>    +---+                 |             |
>                          +-------------+
>
> There is another little box labelled 'b'. After executing 'a = b', both
> little boxes have arrows pointing to the same big box.
[...]
> In this model, a "reference" is an arrow. Manipulating references
> consists of rubbing out the arrows and redrawing them differently.

All very good, but that's not what takes place at the level of Python
code. It's all implementation. I think Hans Georg Schaathun made a good
objection to the idea that "Python has references":

In Pascal a pointer is a distinct data type, and you can have variables
of a given type or of type pointer to that given type. That makes the
pointer a concrete concept defined by the language.

The same can't be said of "references" in Python. It's not part of
Python the language, although it might be part of Python's
implementation.


> Also
> in this model, a "variable" is a little box. It's *not* the same thing
> as a name; a name is a label for a variable, not the variable itself.

That's essentially the same model used when dealing with pointers. I've
used it myself, programming in Pascal. The "little box" named a or b is
the pointer variable, and the "big box" is the data that the pointer
points to.

It's not an awful model for Python: a name binding a = obj is equivalent
to sticking a reference (a pointer?) in box a that points to obj.
Certainly there are advantages to it.

But one problem is, the model is ambiguous with b = a. You've drawn
little boxes a and b both pointing to the big box (which I deleted for
brevity). But surely, if a = 1234 creates a reference from a to the big
box 1234, then b = a should create a reference from b to the box a?

                          +-------------+
    +---+                 |             |
  a | --+---------------->|             |
    +---+                 |             |
      ^                   +-------------+
      |
    +-|-+
  b | | |
    +---+

which is the reference (pointer) model as most people would recognise
it. That's how it works in C and Pascal (well, at least with the
appropriate type declarations). To get b pointing to the big box, you
would need an explicit dereference: "b = whatever a points to" rather
than "b = a".

Of course, both of these concepts are models, which is another word for
"lies" *wink*.
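
The difference between the two pictures can be checked from Python code
itself. A minimal sketch, with made-up values (1234 and 5678 are only
illustrations): if b = a stuck an arrow from b to the little box a, then
rebinding a would drag b along with it. As far as I can tell, it
doesn't.

```python
# Illustrative values only; any objects would do.
a = 1234
b = a                  # b is bound to the same object as a, not to the name a
same_object = b is a   # True: both names refer to one object

a = 5678               # rebind a to a different object
b_after = b            # b is still bound to the original object, 1234
```

So whatever b = a does, it is not "b points to the little box a".
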
Your model is closer to what the CPython implementation actually does,
using actual C pointers, except of course you do need to dereference the
pointers appropriately.

One of my objections to it is not that it is wrong (all models are
wrong) but that it will mislead some people to reason incorrectly about
Python's behaviour, e.g. that b now points to the little box a, and
therefore if you change what a points to, b will follow along. The whole
"call by reference" thing. I suppose you might argue that you're not
responsible for the misunderstandings of blinkered C coders *wink*, and
there's something to that.

But there's another objection... take, say, the line of Python code:

    n = len('hello world')

I can identify the little box "n", which ends up pointing to the big box
holding int 11; another little box "len", which points to a big box
holding a function; and a third big box holding the string 'hello
world'. But where is its little box?

If len were written in pure Python, then *inside* len's namespace there
would be a local little box for the argument. I expect that there is an
analogous local little box for built-in functions too. But I don't care
what happens inside len. What about outside len? Where's the little box
pointing to 'hello world'?

So it seems your model fails to deal with sufficiently anonymous
objects. I say "sufficiently", because of course your model deals fine
with objects without names inside, say, lists: the little box there is
the list slot rather than a named entry in a namespace.

It's not just literals that your model fails to deal with, it's any
expression that isn't bound to a little box:

    n = len('hello world') + func(y)

func(y) produces a new object, a big box. Where is the little box
pointing to it?

If we drop down an abstraction layer, we can see where objects live:

>>> code = compile("n = len('hello world') + func(y)", '', 'single')
>>> import dis
>>> dis.dis(code)
  1           0 LOAD_NAME                0 (len)
              3 LOAD_CONST               0 ('hello world')
              6 CALL_FUNCTION            1
              9 LOAD_NAME                1 (func)
             12 LOAD_NAME                2 (y)
             15 CALL_FUNCTION            1
             18 BINARY_ADD
             19 STORE_NAME               3 (n)
             22 LOAD_CONST               1 (None)
             25 RETURN_VALUE

Both the call to len and the call to func push their results onto the
stack. There's no little box pointing to the result. There's no little
box pointing to len, or y, or any of the others: there are just names
and abstract operations LOAD_NAME and friends.

Again, this is just an abstraction layer. Python objects can be huge,
potentially hundreds of megabytes or gigabytes. No way are they being
pushed and popped onto a stack, even if the virtual machine gives the
illusion that they are. For practical reasons, there must be some sort
of indirection. But that's implementation and not the VM's model.


> It seems that you would prefer to eliminate the little boxes and arrows
> and write the names directly beside the objects:
>
>      +-------------+
>    a |             |
>      |             |
>    b |             |
>      +-------------+
>
>      +-------------+
>    c |             |
>      |             |
>      |             |
>      +-------------+

That's not a bad model. But again, it's incomplete, because it would
suggest that the big box should be able to read its own name tags, which
of course is impossible in Python. But I suppose one might say, if the
tag is on the outside of the object, the object can't use introspection
to see it, can it?

But note that this is really your model in disguise: if you imagine the
name tags are stuck on with a little piece of blutack, and you carefully
pull on the name and stretch it away, you get a name sitting over here
with a tiny thread of blutack attaching it to the big box over there,
just like in your model.

I actually prefer to keep the nature of the mapping between name and
object abstract: names map to objects.
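
As a rough sketch of "names map to objects", assuming a namespace is
nothing more than a dict from name strings to objects (namespace,
big_box and names_of are names I've made up for illustration, not
anything built in): the tags live in the namespace, so it is the holder
of the namespace, not the object, that can find them.

```python
def names_of(obj, namespace):
    """Return, sorted, every name in namespace bound to obj (by identity)."""
    return sorted(name for name, value in namespace.items() if value is obj)

big_box = [1, 2, 3]
# A hypothetical namespace: a and b are tags on one object, c on another.
namespace = {"a": big_box, "b": big_box, "c": [1, 2, 3]}

# c is bound to an equal but distinct list, so it is not one of the tags.
tags = names_of(big_box, namespace)

# The list itself carries no record of the names bound to it.
knows_own_name = hasattr(big_box, "__name__")
```

Here tags comes out as ['a', 'b'] and knows_own_name is False.
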
Objects float around in space, wherever the interpreter happens to put
them, and you can optionally give them names. Objects are dumb and don't
know their own name, but the Python interpreter knows the names. Names
are not the only way to ask the interpreter for an object: e.g. you can
put them in a list, and ask for them by position.

If people then ask, how does the interpreter know the names?, I can add
more detail: names are actually strings in a namespace, which is usually
nothing more than a dict. Oh, and inside functions, it's a bit more
complicated still. And so on.

There is a problem with my model of free-floating objects in space: it
relies on objects being able to be in two places at once, even *inside*
themselves (you can append a list to itself). If you hate that concept,
you'll hate my model. But if you're a science fiction fan from way back,
then you won't have any problem with the idea that objects can be inside
themselves:

http://www.youtube.com/watch?v=51JtuEa_OPc

Remember: it's just a model, and all models are lies. Abstractions all
leak. You can only choose where and how badly they break down.

Now, that's a good challenge for your model. Little boxes only point to
big boxes. So how do you model cycles, including lists that contain
themselves?


> But what would you do about lists? With little boxes and arrows, you can
> draw a diagram like this:
>
>    +---+      +---+
>  a | --+----->|   |      +-------------+
>    +---+      +---+      |             |
>               | --+----->|             |
>               +---+      |             |
>               |   |      +-------------+
>               +---+
>
> (Here, the list is represented as a collection of variables. That's why
> variables and names are not the same thing -- the elements of the list
> don't have textual names.)

But that's wrong! Names (little boxes) can't point to *slots in a list*,
any more than they can point to other names! This doesn't work:

>>> L = [None, 42, None]
>>> a = L[0]
>>> L[0] = 23
>>> print(a)  # This doesn't work! (It really prints None.)
23

It's a pity that Python doesn't actually have references.
Imagine how much time we'd save: all the unproductive time we spend
arguing about whether Python has references, we could be fixing bugs
caused by the use of references instead...


> But without any little boxes or arrows, you can't represent the list
> itself as a coherent object. You would have to go around and label
> various objects with 'a[0]', 'a[1]', etc.
>
>        +-------------+
>   a[0] |             |
>        |             |
>        |             |
>        +-------------+
>
>        +-------------+
>   a[1] |             |
>        |             |
>        |             |
>        +-------------+
>
> This is not very satisfactory. If the binding of 'a' changes, you have
> to hunt for all your a[i] labels, rub them out and rewrite them next to
> different objects. It's hardly conducive to imparting a clear
> understanding of what is going on, whereas the boxes-and-arrows model
> makes it instantly obvious.

But I wouldn't do it like that. I'd do it like this:

          0        1        2        3        4
     +--------+--------+--------+--------+--------+
   a |        |        |        |        |        |
     |        |        |        |        |        |
     |        |        |        |        |        |
     +--------+--------+--------+--------+--------+

which conveniently happens to be the way Python lists actually are set
up. More or less.


[...]
> Finally, there's another benefit of considering a reference to be a
> distinct entity in the data model. If you think of the little boxes as
> being of a fixed size, just big enough to hold a reference, then it's
> obvious that you can only bind it to *one* object at a time. Otherwise
> it might appear that you could draw more than one arrow coming out of a
> name, or write the same name next to more than one object.

But that's pretty much an arbitrary restriction. Why are the boxes so
small? Just because. Why can't you carefully tease the thread of blutack
apart, into a bifurcated Y shaped thread? Just because. If objects can
be in two places at once, why can't names? Just because.

(Actually, because Guido decrees it so. Also because it would be kinda
crazy to do otherwise. I can't think of any language that has a
many:many mapping between names and values.)

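
For anyone who wants to poke at the two points about lists, here is a
short sketch using the same L as in the session above, plus a list I've
called cycle purely for illustration. Binding a name to L[0] binds it to
the object sitting in that slot at the time, not to the slot, and a slot
can perfectly well hold the very list it belongs to.

```python
L = [None, 42, None]
a = L[0]              # a is bound to the object in slot 0 (None), not to the slot
L[0] = 23             # rebinding the slot does not touch a
print(a)              # None

cycle = []
cycle.append(cycle)        # the list's only slot now holds the list itself
print(cycle[0] is cycle)   # True
print(cycle)               # [[...]]
```
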
--
Steven