On Oct 16, 2008, at 7:30 PM, Steven D'Aprano wrote:

However, 'bob' here really is a variable. It's a variable whose value
(at the moment) is a reference to some object.

Traditionally, a "variable" is a named memory location.

Agreed.

The main objection I have to using "variable" to describe Python
name/value bindings is that it has connotations that will confuse
programmers who are familiar with C-like languages. For example:

def inc(x):
   x += 1

n = 1
inc(n)
assert n == 2

Why doesn't that work? This is completely mysterious to anyone expecting
C-like variables.

Hmm... I'm not following you. That wouldn't work in C, either. 'x' in 'inc' is a local variable; its value is just a copy of whatever value you pass in. You can increment it all you want, and it won't affect the original variable (if indeed it was a variable that the value came from; it could be a literal or an expression or who knows what else).
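
A minimal sketch of what I mean, in Python (the names 'unchanged' and the returning version of 'inc' are mine, just for illustration): rebinding the parameter inside the function leaves the caller's variable alone, and the ordinary fix is to return the new value and assign it.

```python
def inc(x):
    # 'x' is a local variable; += on an int rebinds it to a new value
    x += 1
    return x

n = 1
inc(n)           # result discarded, so the caller's n is untouched
unchanged = n    # still 1

n = inc(n)       # the caller rebinds its own variable explicitly
```

That is the same shape the equivalent C function would have to take if it received an int by value.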

At this point people will often start confusing the issue by claiming
that "all Python variables are pointers", which is an *implementation
detail* in CPython but not in other implementations, like PyPy or Jython.

I'm not claiming that -- and I'm trying to clarify, rather than confuse, the issue. (Of course if it turns out that my understanding of Python is incorrect, then I'm hoping to uncover and correct that, too.)

Or people will imagine that Python makes a copy of the variable when you call a function. That's not true, and in fact Python explicitly promises
never to copy a value unless you explicitly tell it to

Now that IS mysterious. Doesn't calling a function add a frame to a stack? And doesn't that necessitate copying in values for the variables in that stack frame (such as 'x' above)? Of course we're now delving into internal implementation details... but it sure behaves as though this is exactly what it's doing (and is the same thing every other language does, AFAIK).
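
For what it's worth, here is a rough sketch of how one might check what actually gets copied (the function and variable names are made up for the purpose). If my reading is right, the thing copied into the new frame is the reference, not the object it refers to:

```python
data = [1, 2, 3]

def received_same_object(arg):
    # 'is' compares object identity, not equality
    return arg is data

# Passing the list hands the function the very same object...
not_copied = received_same_object(data)

# ...whereas an explicit copy is equal in content but a distinct object.
explicit_copy = received_same_object(list(data))
```

Here 'not_copied' comes out True and 'explicit_copy' comes out False, which fits "the value passed is a reference".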

but it seems to explain the above, at least until the programmer starts
*assuming* call-by-value behaviour and discovers this:

def inc(alist):
   alist += [1]  # or alist.append(1) if you prefer
   return alist

It's still call-by-value behavior. The value in this case is a list reference. Using .append, or the += operator, modifies the list referred to by that list reference. Compare that to:

def inc(alist):
   alist = alist + [1]
   return alist

where you are not modifying the list passed in, but instead creating a new list, and storing a reference to that in local variable 'alist'.
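
Putting the two versions side by side, as a sketch (I've renamed them 'inc_in_place' and 'inc_rebind' so both can live in one file; those names are mine):

```python
def inc_in_place(alist):
    alist += [1]          # for lists, += extends the existing object
    return alist

def inc_rebind(alist):
    alist = alist + [1]   # builds a new list; only the local name changes
    return alist

a = [0]
b = [0]
ra = inc_in_place(a)      # a itself is modified, and ra is the same object
rb = inc_rebind(b)        # b is left alone, and rb is a different list
```

In both cases the function received a copy of the reference; the only difference is whether it mutated the object referred to or pointed its local variable somewhere else.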

The semantics here appear to be exactly the same as Java or REALbasic or any other modern language: variables are variables, and parameters are local variables passed by value, and it just so happens that some values may be references to data on the heap.

Are functions call by value or call by reference???

(Answer: neither. They are call by name.)

I have no idea what that means. They're call by value as far as I can tell. (Even if the value may happen to be a reference.)

Side question, for my own education: *does* Python have a "ByRef" parameter mode?

I myself often talk about variables as shorthand. But it's a bad habit,
because it is misleading to anyone who thinks they know how variables
behave, so when I catch myself doing it I fix it and talk about name
bindings.

Perhaps you have a funny idea of what people think about how variables behave. I suspect that making up new terminology for perfectly ordinary things (like Python variables) makes them more mysterious, not less.

Of course, you're entitled to define "variable" any way you like, and
then insist that Python variables don't behave like variables in other
languages. Personally, I don't think that's helpful to anyone.

No, but if we define them in the standard way, and point out that Python variables behave exactly like variables in other languages, then that IS helpful.

Well, they are variables. I'm not quite grasping the difficulty here... unless perhaps you were (at first) thinking of the variables as holding
the object values, rather than the object references.

But that surely is what almost everyone will think, almost all the time.
Consider:

x = 5
y = x + 3

I'm pretty sure that nearly everyone will read it as "assign 5 to x, then
add 3 to x and assign the result to y" instead of:

"assign a reference to the object 5 to x, then dereference x to get the
object 5, add it to the object 3 giving the object 8, and assign a
reference to that result to y".

True. I have no reason to believe that, in the case of a number, the value isn't the number itself. (Except for occasional claims that "everything in Python is an object," but if that's literally true, what are the observable implications?)

Of course that's what's really happening under the hood, and you can't
*properly* understand how Python behaves without understanding that. But
I'm pretty sure few people think that way naturally, especially noobs.

In this sense I'm still a noob -- until a couple weeks ago, I hadn't touched Python in over a decade. So I sure appreciate this refresher. If numbers really are wrapped in objects, that's surprising to me, and I'd like to learn about any cases where you can actually observe this. (It's not apparent from the behavior of the += operator, for example... if they are objects, I would guess they are immutable.)
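
A few things one could try at the interpreter to probe this, as a sketch only (it assumes a current Python 3, and the variable names are arbitrary):

```python
n = 5

is_object = isinstance(n, object)   # does an int count as an object?
type_name = type(n).__name__        # the name of its type
bits = n.bit_length()               # ints carry methods; 5 is 0b101

m = n       # m and n now refer to the same int
n += 1      # rebinds n to a different int; the 5 itself is not altered
```

If ints are immutable objects, then 'm' should still be 5 after the +=, which is exactly what you'd expect from a plain value as well. That would explain why the difference is so hard to observe.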

But it's not at all surprising with lists and dicts and objects -- every modern language passes around references to those, rather than the data themselves, because the data could be huge and is often changing size all the time. Storing the values in a variable would just be silly.

References are essentially like pointers, and learning pointers is
notoriously difficult for people.

Hmm... I bet you're over 30. :) So am I, for that matter, so I can remember when people had to learn "pointers" and found it difficult. But nowadays, the yoots are raised on Java, or are self-taught on something like REALbasic or .NET (or Python). There aren't any pointers, but only references, and the semantics are the same in all those languages. Pointer difficulty is something that harkens back to C/C++, and they're just not teaching that anymore, except to the EE majors.

So, if the semantics are all the same, I think it's helpful to use the standard terminology.

Python does a magnificent job of making
references easy, but it does so by almost always hiding the fact that it
uses references under the hood. That's why talk about variables is so
seductive and dangerous: Python's behaviour is *usually* identical to the
behaviour most newbies expect from a language with "variables".

You could be right, when it comes to numeric values -- if these are immutable objects, then I can safely get by thinking of them as pure values rather than references (which is what they are in RB, for example). Strings are another such case: since they're immutable, you can safely treat them as values, but it's comforting to know that you're not incurring the penalty of copying a huge data buffer every time you pass one to a function or assign it to another variable.
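
A small sketch of the string case (the names 'passthrough' and 'big' are hypothetical, chosen just for this example): passing a large string around hands over the same object rather than a copy, and an attempt to change it in place is refused.

```python
def passthrough(s):
    # returns whatever reference it was given
    return s

big = "x" * 1_000_000

# The function got the very same string object, not a copy of the buffer.
same_object = passthrough(big) is big

# Strings do not support item assignment, so in-place mutation fails.
try:
    big[0] = "y"
    mutated = True
except TypeError:
    mutated = False
```

So treating a string as a value is safe precisely because nobody holding a reference to it can change it.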

But with mutable objects, it is ordinary and expected that what you have is a reference to the object, and you can tell this quite simply by mutating the object in any way. Every modern language I know works the same way, and I'd wager that the ones I don't know (e.g. Ruby) also work that way. Python's a beautiful language, but I'm afraid it's nothing special in this particular regard.

Best,
- Joe

--
http://mail.python.org/mailman/listinfo/python-list
