On Oct 16, 2008, at 7:30 PM, Steven D'Aprano wrote:
However, 'bob' here really is a variable. It's a variable whose value
(at the moment) is a reference to some object.
Traditionally, a "variable" is a named memory location.
Agreed.
The main objection I have to using "variable" to describe Python
name/value bindings is that it has connotations that will confuse
programmers who are familiar with C-like languages. For example:
    def inc(x):
        x += 1

    n = 1
    inc(n)
    assert n == 2
Why doesn't that work? This is completely mysterious to anyone expecting
C-like variables.
Hmm... I'm not following you. That wouldn't work in C, either. 'x'
in 'inc' is a local variable; its value is just a copy of whatever
value you pass in. You can increment it all you want, and it won't
affect the original variable (if indeed it was a variable that the
value came from; it could be a literal or an expression or who knows
what else).
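
A minimal sketch of the point above, assuming Python 3. The names `inc` and `n` come from the example under discussion; `result` and the added `return` are only there for illustration. Rebinding the local `x` leaves the caller's `n` alone:

```python
def inc(x):
    # x is a local name; += on an int rebinds x to a new int object
    x += 1
    return x

n = 1
result = inc(n)

# The caller's n is untouched; only the local x was rebound.
print(n, result)  # prints: 1 2
```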
At this point people will often start confusing the issue by claiming
that "all Python variables are pointers", which is an *implementation
detail* in CPython but not in other implementations, like PyPy or
Jython.
I'm not claiming that -- and I'm trying to clarify, rather than
confuse the issue. (Of course if it turns out that my understanding
of Python is incorrect, then I'm hoping to uncover and correct that,
too.)
Or people will imagine that Python makes a copy of the variable when you
call a function. That's not true, and in fact Python explicitly promises
never to copy a value unless you explicitly tell it to
Now that IS mysterious. Doesn't calling a function add a frame to a
stack? And doesn't that necessitate copying in values for the
variables in that stack frame (such as 'x' above)? Of course we're
now delving into internal implementation details... but it sure
behaves as though this is exactly what it's doing (and is the same
thing every other language does, AFAIK).
but it seems to explain the above, at least until the programmer
starts *assuming* call-by-value behaviour and discovers this:
    def inc(alist):
        alist += [1]  # or alist.append(1) if you prefer
        return alist
It's still call-by-value behavior. The value in this case is a list
reference. Using .append, or the += operator, modifies the list
referred to by that list reference. Compare that to:
    def inc(alist):
        alist = alist + [1]
        return alist
where you are not modifying the list passed in, but instead creating a
new list, and storing a reference to that in local variable 'alist'.
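
The two versions can be put side by side. This is only a sketch, assuming Python 3; the names `inc_in_place` and `inc_rebind` are made up here to tell the two functions apart:

```python
def inc_in_place(alist):
    # += on a list mutates the object the caller also refers to
    alist += [1]
    return alist

def inc_rebind(alist):
    # + builds a new list; only the local name alist is rebound
    alist = alist + [1]
    return alist

a = [0]
b = [0]
ra = inc_in_place(a)
rb = inc_rebind(b)

print(a, ra is a)  # prints: [0, 1] True
print(b, rb is b)  # prints: [0] False
```

In both cases the function receives a reference to the caller's list; what differs is whether the function mutates that list or merely points its local name at a new one.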
The semantics here appear to be exactly the same as Java or REALbasic
or any other modern language: variables are variables, and parameters
are local variables passed by value, and it just so happens that
some values may be references to data on the heap.
Are functions call by value or call by reference???
(Answer: neither. They are call by name.)
I have no idea what that means. They're call by value as far as I can
tell. (Even if the value may happen to be a reference.)
Side question, for my own education: *does* Python have a "ByRef"
parameter mode?
I myself often talk about variables as shorthand. But it's a bad habit,
because it is misleading to anyone who thinks they know how variables
behave, so when I catch myself doing it I fix it and talk about name
bindings.
Perhaps you have a funny idea of what people think about how variables
behave. I suspect that making up new terminology for perfectly
ordinary things (like Python variables) makes them more mysterious,
not less.
Of course, you're entitled to define "variable" any way you like, and
then insist that Python variables don't behave like variables in other
languages. Personally, I don't think that's helpful to anyone.
No, but if we define them in the standard way, and point out that
Python variables behave exactly like variables in other languages,
then that IS helpful.
Well, they are variables. I'm not quite grasping the difficulty here...
unless perhaps you were (at first) thinking of the variables as holding
the object values, rather than the object references.
But that surely is what almost everyone will think, almost all the time.
Consider:
    x = 5
    y = x + 3
I'm pretty sure that nearly everyone will read it as "assign 5 to x, then
add 3 to x and assign the result to y" instead of:
"assign a reference to the object 5 to x, then dereference x to get the
object 5, add it to the object 3 giving the object 8, and assign a
reference to that result to y".
True. I have no reason to believe that, in the case of a number, the
value isn't the number itself. (Except for occasional claims that
"everything in Python is an object," but if that's literally true,
what are the observable implications?)
Of course that's what's really happening under the hood, and you can't
*properly* understand how Python behaves without understanding that. But
I'm pretty sure few people think that way naturally, especially noobs.
In this sense I'm still a noob -- until a couple weeks ago, I hadn't
touched Python in over a decade. So I sure appreciate this
refresher. If numbers really are wrapped in objects, that's
surprising to me, and I'd like to learn about any cases where you can
actually observe this. (It's not apparent from the behavior of the +=
operator, for example... if they are objects, I would guess they are
immutable.)
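
One way this could be observed, as a minimal sketch assuming Python 3 (`int.bit_length()` does not exist in older releases; the names `n` and `old` are only for illustration). An int answers to `isinstance(..., object)`, has methods, and `+=` rebinds the name to a different object rather than changing the old one:

```python
n = 5

# An int is an instance of object and carries methods like any other object.
print(isinstance(n, object))  # prints: True
print(type(n).__name__)       # prints: int
print(n.bit_length())         # prints: 3  (5 is 0b101)

# Keep a second reference to the original object, then "increment" n.
old = n
n += 1

# The original object still holds 5; n now refers to a different object.
print(old, n)        # prints: 5 6
print(n is not old)  # prints: True
```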
But it's not at all surprising with lists and dicts and objects --
every modern language passes around references to those, rather than
the data itself, because the data could be huge and is often
changing size all the time. Storing the values in a variable would
just be silly.
References are essentially like pointers, and learning pointers is
notoriously difficult for people.
Hmm... I bet you're over 30. :) So am I, for that matter, so I can
remember when people had to learn "pointers" and found it difficult.
But nowadays, the yoots are raised on Java, or are self-taught on
something like REALbasic or .NET (or Python). There aren't any
pointers, but only references, and the semantics are the same in all
those languages. Pointer difficulty is something that harkens back to
C/C++, and they're just not teaching that anymore, except to the EE
majors.
So, if the semantics are all the same, I think it's helpful to use the
standard terminology.
Python does a magnificent job of making references easy, but it does so
by almost always hiding the fact that it uses references under the hood.
That's why talk about variables is so seductive and dangerous: Python's
behaviour is *usually* identical to the behaviour most newbies expect
from a language with "variables".
You could be right, when it comes to numeric values -- if these are
immutable objects, then I can safely get by thinking of them as pure
values rather than references (which is what they are in RB, for
example). Strings are another such case: as immutable, you can safely
treat them as values, but it's comforting to know that you're not
incurring the penalty of copying a huge data buffer every time you
pass one to a function or assign it to another variable.
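
A small sketch of the no-copy point, assuming Python 3; the function name `identity_of` and the variable `big` are hypothetical. Inside the function, the parameter refers to the very same string object the caller holds:

```python
def identity_of(s):
    # id() identifies the object that the parameter s refers to
    return id(s)

big = "x" * 1000000  # a large string; it is never copied below
alias = big          # assignment binds another name to the same object

print(alias is big)                 # prints: True
print(identity_of(big) == id(big))  # prints: True
```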
But with mutable objects, it is ordinary and expected that what you
have is a reference to the object, and you can tell this quite simply
by mutating the object in any way. Every modern language I know works
the same way, and I'd wager that the ones I don't know (e.g. Ruby)
also work that way. Python's a beautiful language, but I'm afraid
it's nothing special in this particular regard.
Best,
- Joe