Re: Finding the instance reference of an object [long and probably boring]

Joe Strout Fri, 07 Nov 2008 07:49:22 -0800

On Nov 6, 2008, at 10:35 PM, Steve Holden wrote:

That's good to hear. Your arguments are sometimes pretty good, and
usually well made, but there's been far too much insistence on all sides
about being right and not enough on reaching agreement about how
Python's well-defined semantics for assignment and function calling
should best be described.


In other words, it's a classic communication problem.


That's a fair point.  I'll try to do better.

I must say I find it strange when people try to contradict my assertion
that Python names are references to objects, when the (no pun intended)
reference implementation of the language uses "reference counting" to
track how many assignments have been made.


I agree.  It seems like we should be able to take that as a given.
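
To make the reference-counting point concrete, here is a minimal sketch. It assumes CPython (other implementations need not count references at all), and the helper name refcount_delta is made up for illustration. It binds a second name to an object and checks how the count reported by sys.getrefcount changes.

```python
import sys

def refcount_delta():
    """Return how much the reference count grows when one more name is bound."""
    obj = object()                   # a fresh object, referenced by one local name
    before = sys.getrefcount(obj)    # the call itself may add a temporary reference
    alias = obj                      # bind a second name to the same object
    after = sys.getrefcount(obj)     # same temporary overhead as before
    return after - before

print(refcount_delta())
```

Only the difference between the two readings is meaningful here; the absolute numbers depend on interpreter internals.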

So any argument that the language "doesn't have the concept of object
reference (in the sense of e.g. C++ reference)" is simply stating the
obvious: that Python has no way to declare reference variables. I would
argue myself that it has no need of such a mechanism precisely because
names are object references, and I'd like to hear counter-arguments.

Right. I think of it this way: every variable is an object reference; no special syntax needed for it because that's the only type of variable there is. (Just as with Java or .NET, when dealing with any class type; Python is just a little more extreme in that even simple things like numbers are wrapped in objects.)
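
A small sketch of that claim, using nothing beyond built-in behavior (the variable names are arbitrary): even a plain integer is an object with a type and methods, and assigning it to a second variable yields a second reference to the same object.

```python
n = 1000
m = n                       # copies the reference held by n into m
same_object = m is n        # both names refer to one int object
bits = n.bit_length()       # ints carry methods, like any other object

print(same_object, type(n).__name__, bits)   # True int 10
```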

Note: I tried to say "name" above instead of "variable" but I couldn't bring myself to do it -- "name" seems too generic to do that job. Lots of things have names that are not variables: modules have names, classes have names, methods have names, and so do variables. If I say "name," an astute listener would reasonably say "name of what" -- and I don't want to have to say "name of some thing in a name space which can be flexibly associated with an object" when the simple term "variable" seems to work as well.

Well that's not true either. If I remember all the way back to my
computational science degree I seem to remember being taught that there
was call by *simple reference*, which is what I understand you to mean.

Suppose I write the following in some not-quite-Python language:

lst = ['one', 'two', 'three']

index = 1

def foo(item, i):
  i = 2
  item = "ouch"

foo(lst[index], index)
...
With call by simple reference, after the call I would expect the
following conditions to be true:

index == 2
lst == ['one', 'ouch', 'three']

Yes, I guess so, though it would require that lst[index] evaluate to an lvalue to which the 'item' parameter could be an alias. (With the second parameter, 'i', the situation is more straightforward because you're passing in a simple variable rather than a more complex expression.)

With full call by reference, however, arguably the change to the value
of index would induce the post-conditions

index == 2
lst == ['one', 'two', 'ouch']

because the reference made by the first argument depends on the value of
a variable mutated inside the function call.

I confess that I've never heard of "call by simple reference" or "call by full reference" before. What you're describing in the second case sounds more like call by name to me.

But I think we can agree that neither of these behaviors describes Python.
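
For comparison, here is the same snippet run as actual Python. Neither set of post-conditions above holds: assigning to the parameters inside foo only rebinds foo's local names, so the caller's index and lst are untouched.

```python
lst = ['one', 'two', 'three']
index = 1

def foo(item, i):
    i = 2            # rebinds the local name i only
    item = "ouch"    # rebinds the local name item only

foo(lst[index], index)

print(index)   # 1
print(lst)     # ['one', 'two', 'three']
```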

Why the resistance to these simple and basic terms that apply to any OOP
language?

Ideally I'd like to see this discussion concluded without resorting to
democratic appeals. Otherwise, after all, we should all eat shit: sixty
billion flies can't possibly be wrong.

I think I could make a good argument that the nutritional needs of flies are different from those of humans. On the other hand, what argument is there that the Python community should use its own unique terminology for concepts that apply equally well to other languages? Wouldn't communication be easier and smoother if we adopted standard terms for standard behavior?

What does "give a new name to an object" mean? I submit that it means
exactly the same thing as "assigns the name to refer to the object".
I normally internalize "x = 3" as meaning "store a reference to the
object 3 in the slot named x", and when I see "x" in an expression I
understand it to be a reference to some object, and that the value will
be used after dereferencing has taken place.


Works for me.

I've seen various descriptions of Python's name binding behavior in
terms of attaching Post-it notes bearing names to the objects referenced
by the names, and I have never found them convincing. The reason for
this is that names live in namespaces, whereas values live in some other
universe altogether (that I normally describe as "object space" to
beginners, though this is not a term you will come across in the Python
literature).

Agreed. That model implies that all names are global, and completely fails to explain how one object might be named "x" and a completely different object might also be "x" (albeit in a different namespace). I suppose your post-its could be color-coded by namespace, and then you could add additional warts and caveats and addendums to explain recursion, or explain why you don't have to search all objects in existence to find the right one every time a name is dereferenced, but the whole thing seems like a house of cards to me.

So I see the Post-it as being attached to a portion of some
namespace, and that little fixed-size piece of object space being
attached by a piece of string to a specific object. Of course any object
can have many pieces of string attached, and not all of them come from
names -- some of them come from container elements, for example.


Right.
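
The namespace-and-string picture can be sketched in a few lines. All names here (x, f, shared, alias, container) are arbitrary choices for illustration: the same name can exist in two namespaces and refer to different objects, and one object can be reached through several "strings", some from names and some from container elements.

```python
x = ["module-level"]            # the name "x" in the module namespace

def f():
    x = ["local"]               # a different "x", living in f's local namespace
    return x

shared = []                     # one object in "object space"
alias = shared                  # a second string, from another name
container = [shared]            # a third string, from a container element
alias.append("hello")           # mutate the object through one reference

print(f() is x)                 # False
print(container[0], shared)     # ['hello'] ['hello']
```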

There certainly is no difference in behavior that anyone has been able
to point out between what assignment does in Python, and what assignment
does in RB, VB.NET, Java, or C++ (in the context of object pointers, of
course).  If the behavior is the same, why should we make up our own
unique and different terminology for it?

One reason would be that in the other languages you have other choices
as well, so you need to distinguish between them. Python is simpler, and
so I don't see us needing the terminological complexity required in the
other contexts you name, for a start.

OK, that's a fair argument, and I do suspect this is a big part of it -- when your language clearly supports passing object references and other types by-ref and by-val, and you can easily demonstrate the difference, then there is little temptation to claim that it doesn't do either one. But if your language supports only one of these, and you have no choices about it and can't (within the language itself) compare and contrast that one against another, then it is easy to make all sorts of claims about what that one is.

But getting back to your point: is the standard terminology really more complex than whatever else we can come up with?

Java messed up the whole deal by
having different kinds of objects as a sacrifice to run-time speed,
thereby breeding a whole generation of programmers with little clue
about these matters, and the .NET environment also has to resort to
"boxing" and "unboxing" from time to time. I say away with comparisons
to such horrendously complex issues. One of the reasons for Python's
continued march towards world domination (allow me my fantasies) is its
consistent simplicity. Those last two words would be my candidate for
the definition of "Pythonicity".

I'm with you there. To me, the consistent simplicity is exactly this: all variables are object references, and these are always passed by value.

- the parameters of a function are local names for the call arguments

Agreed; they're not aliases of the call arguments.

They are actually names local to the function namespace, containing
references to the arguments. Some of those arguments were provided as
names, in which case the local name contains a copy of the reference
bound to the name provided as an argument. This is, however, merely a
degenerate case of the general instance, in which an expression is
provided as an argument and evaluated, yielding (a reference to) an
object which is then bound to the parameter name in the local namespace.


Quite right.
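
A rough sketch of the two cases just described, with a hypothetical function called identify. When the argument is a name, the parameter ends up bound to the very same object; when it is an expression, the expression is evaluated first and the parameter is bound to whatever object results. The co_varnames attribute used at the end is a CPython detail, shown only to confirm that the parameter is a local name.

```python
def identify(param):
    # "param" is a name in identify's local namespace; hand back the
    # object it is bound to so the caller can inspect its identity.
    return param

data = [1, 2, 3]
from_name = identify(data)            # argument provided as a name
from_expr = identify(data + [4])      # argument provided as an expression

print(from_name is data)              # True
print(from_expr, from_expr is data)   # [1, 2, 3, 4] False
print(identify.__code__.co_varnames)  # ('param',)
```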

  (I guess 'pass by object' is a good name).


Well, I'm not sure why that would be.  What you've just described is
called "pass by value" in every other language.

Sigh. This surely can only be true if you insist that references are
themselves values. I hold that they are not.

Here's an example of the above, I guess. In a language that supports integers and doubles as simple types, stored directly in a variable, it is an obvious generalization that in the case of an object type, the value is a reference to an object. (Then you can "dereference" such a value to get to the values stored within the object.) It is the only simple and consistent description of such a language (which includes Java, RB, and .NET, as well as C++ if you consider an object pointer equivalent to a reference in more modern languages).

But Python doesn't have those simple types, so there is a temptation to try to skip this generalization and say that references are not values, but rather the values are the objects themselves (despite the dereferencing step that is still required to get any data out of them). Well, and of course in the case of immutable objects, there is very little observable difference between references and values.

However, it seems to me that when you start denying that the value of an object reference is a reference to an object, this is when you get led into a quagmire of contradictions. Perhaps I'm wrong and I just haven't explored that path far enough, because it appears dark and cobwebby to my eyes. I will try to give it a chance.
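
The remark about immutable objects can be illustrated with a short sketch (variable names are arbitrary). With a tuple, sharing a reference is hard to observe because no operation can change the tuple in place; with a list, the sharing shows up as soon as the object is mutated.

```python
a = (1, 2)
b = a            # b refers to the same tuple
b += (3,)        # tuples are immutable: this builds a new tuple and rebinds b

c = [1, 2]
d = c            # d refers to the same list
d += [3]         # lists are mutable: this extends the shared list in place

print(a, b)      # (1, 2) (1, 2, 3)
print(c, d)      # [1, 2, 3] [1, 2, 3]
print(d is c)    # True
```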

It seems so transparent to me that the parameters are copies of the references passed as arguments that I find it difficult to understand how, or why, anyone would conceptualize it differently.

Now you seem to be saying the same thing I've been saying all along. But this really is called "pass by value" in at least RB, VB.NET, and Java. And that makes sense to me.

OK, so above you argue quite cogently that Python uses a reference-passing mechanism.


Yes, of course.

This makes your insistence in the preceding paragraph on calling it "pass by value" a little stubborn.


Why?  Are you really meaning to insist that the RB/VB.NET example:

  Function GetAgeInDogYears(ByVal whom As Person) As Integer
    return whom.age * 7
  End Function

is not actually using a by-value parameter? Or that it's not passing an object reference?

Sigh again. You appear to want to have your cake and eat it. You are, in
effect, saying "there are no values in Python, only references",
completely ignoring the fact that it is semantically impossible to have
a reference without having something to *refer to*

Well of course. I'm pretty sure I've said repeatedly that Python variables refer to objects on the heap. (Please replace "heap" with "object space" if you prefer.) I'm only saying that Python variables don't contain any other type of value than references -- no integers or doubles, for example. This is unlike the other languages under discussion (and may be at the root of the confusion).

(which we in the Python world, in our usual sloppy way, often call "a value").

Yes, and as long as we're agreed that this is only a sloppy shorthand, I'm OK with it (especially in the case of immutable objects, where the distinction is irrelevant).

I suspect this may be at the root of our equally stubborn insistence
that calling this mechanism "pass by value" is inviting
misunderstanding. If we didn't want to eliminate misunderstanding we
would all have stopped replying to you long ago.

Ditto right back at you. :) So maybe here's the trouble: since all Python variables are references, there is no need to distinguish reference types from any other types (there aren't any other types). So, with the distinction gone, there is a strong temptation to gloss over the fact that they are references at all, and try to say that the variables directly contain their objects.

But it seems to me that this claim quickly breaks down -- even as you said yourself; you need instead some mental model that shows the variables as pointing to (tied to via strings, associated via a lookup table, or whatever) the objects, which exist in object space. In other words, they're references.

But continuing to attempt to gloss over that fact, when you come to parameter passing, you're then stuck trying to avoid describing it as call by value, since if you claim that what a variable contains is the object itself, then that doesn't fit (since clearly the object itself is not copied). You also have to describe the assignment operator as different from all other languages, since clearly that's not copying the object either.

So you end up in this (to me, very strange) state where you're making up new terms to describe the parameter behavior, and the assignment behavior, which behavior is exactly the same as any other modern OOP language. It makes it (again, IMHO) all seem very much more complex and mysterious than it really is. And this all results inevitably from trying to gloss over the fact that Python variables are references.

So, while I'm trying this path on for size (and will continue to mull it over further), please try on this approach: boldly admit that they're references, and embrace that fact. An assignment copies the RHS reference into the LHS variable, nothing more or less. A parameter copies the argument reference into the formal parameter, nothing more or less. And all this is exactly the same as in any other OOP language the reader is likely to know. Isn't that simple, clear, and far easier to explain?
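
Both halves of that description (assignment copies a reference; a parameter receives a copy of the argument's reference) can be checked in one short sketch. The function birthday and the dictionary layout are invented for illustration only.

```python
def birthday(person):
    person["age"] += 1        # mutates the object the caller also refers to
    person = {"age": 0}       # rebinds only the local name; the caller is unaffected

a = {"age": 3}
b = a                         # assignment copies the reference in a into b
b["age"] = 4                  # so a change made through b is visible through a
b = {"age": 99}               # rebinding b leaves a's reference alone

birthday(a)
print(a, b)                   # {'age': 5} {'age': 99}
```

Mutation through a copied reference is visible to every holder of that reference, while rebinding a name, whether a local variable or a parameter, affects only that name.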

Well, I started with Simula and SmallTalk back in 1973, so my experience
may be a bit light. Sorry about that. This terminology wasn't made up by
Python beginners, but by the people who invented Python.

Was it? Has our BDFL weighed in on this terminology issue anywhere? So far, the only "official" words I've found related to this discussion are the ones plainly admitting that Python uses references (which some in this thread seem to want to deny, though not you Steve).

I believe they did so on the grounds that it's easier for beginners
to understand Python's semantics without having to reference too many
similar in theory but confusingly different in practice other environments.

I wonder if that could be tested systematically. Perhaps we could round up 20 newbies, divide them into two groups of 10, give each one a 1-page explanation either based on passing object references by-value, or passing values sort-of-kind-of-by-reference, and then check their comprehension by predicting the output of some code snippets. That'd be very interesting. It's hard for me to believe that the glossing-over-references approach really is easier for anybody, but maybe I'm wrong.

I would even argue that your confusion supports this argument. Your
understanding of Python is perfectly adequate, so get with the program
for Pete's sake!

In my case, my understanding of Python became clear only once I stopped listening to all the confusing descriptions here, and realized that Python is no different from other OOP languages I already knew.


Best,
- Joe

--
http://mail.python.org/mailman/listinfo/python-list
