On Fri, 09 Jan 2009 20:23:11 +0000, Mark Wooding wrote:

> Steven D'Aprano <st...@remove-this-cybersource.com.au> wrote:
>
>> I'm pretty sure that no other pure-Python coder has manipulated
>> references either. They've manipulated objects.
>
> No: not directly. The Python program deals solely with references;
> anything involving actual objects is mediated by the runtime.

Your claim is ambiguous: when you say "the Python program", are you
talking about the program I write using the Python language, or the
Python VM, or something else? If the first, then you are wrong: the
Python program I write doesn't deal with references. It deals with
objects.

This discussion flounders because we conflate multiple levels of
explanation. People say "You do foo" when they mean "the Python VM does
foo". Earlier, I responded to your claim that I was storing references
by saying I was pretty sure I didn't store references, and gave an
example of the line of Python code x=23. Your response was to mix
explanatory levels:

> You bind names to locations which store immediate representations.
> Python IRs are (in the sense defined above) exclusively references.

I most certainly don't bind names to locations. When I write x=23, I
don't know what the location of the object 23 is, so how could I bind
that location to the name? You are conflating what the Python VM does
with what I do. What *I* do is bind the object 23 to the name x. I don't
know the location of 23, I don't even know if 23 has a well-defined
location or if it is some sort of distributed virtual data structure.

As a Python programmer, that's the level I see: names and objects.

At a lower level, what the Python VM does is store the name 'x' in a
dictionary, bound to the object 23. No locations come into it, because
at this level of explanation, dictionaries are an abstract mapping.
There's no requirement that the abstract dictionary structure works by
storing addresses. All we know is that it maps the name 'x' to the
object 23 somehow. Maybe there's no persistent storage, and the dict
stores instructions telling the VM how to recreate the object 23 when it
is needed. Who knows? But at this explanatory level, there are no
locatives involved. Names and objects float as disembodied entities in
the aether, and dicts map one to the other.
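
Here is a minimal sketch of that names-and-objects level, in Python 3
syntax. It assumes a plain dict can stand in for a module namespace; the
variable name `namespace` is hypothetical, chosen only for illustration.
Nothing in it mentions a location:

```python
# A plain dict stands in for a module's global namespace (an assumption
# made for illustration).
namespace = {}

# Run the assignment statement with that dict as its global namespace.
exec("x = 23", namespace)

# The name 'x' is now a key in the dict, mapped to the object 23.
print("x" in namespace)   # True
print(namespace["x"])     # 23

# Rebinding the name changes the mapping, not the object 23 itself.
exec("x = 'spam'", namespace)
print(namespace["x"])     # spam
```

How the dict gets from the key 'x' to the object is not visible from
here, which is the point.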

At an even lower explanatory level, the CPython implementation of
dictionaries works by storing a pointer (or if you prefer, a reference)
to the object in a hash table. Pointers, of course, are locatives, and
so finally we come to the explanation you prefer. We've gone from
abstract names-and-classes to concrete pointers-to-bytes. But this is at
least two levels deeper than what's visible in Python code. Just about
the only time Python coders work with locatives is when they manually
calculate some index into a string or list, or similar.

At an even lower explanatory level, all the VM does is copy bytes.

And at a lower level still, it doesn't even copy bytes, it just flips
bits.

And below that, we're into physics, and I won't go there.

I daresay you probably get annoyed at me when I bring up explanations at
the level of copying bytes. You probably feel that for the average
Python programmer, *most of the time* such explanations are more
obfuscatory than useful. Of course, there are exceptions, such as
explaining why repeated string concatenation is likely to be slow. There
is a time and a place for such low level explanations, but not at the
high-level overview needed by the average Python programmer.

And you would be right. But I argue that your explanation at the level
of references is exactly the same: it is too low level. It relies on
specific details which may not even be true for all implementations of
Python. It certainly relies on details which won't be true for
hypothetical versions of Python running on exotic hardware. One can do
massively parallel calculations using DNA, and such "DNA computers" are
apparently Turing complete. I have no idea how one would write a Python
virtual machine in such a biological computer, but I'm pretty sure that
data values won't have well-defined locations in a machine that consists
of billions of DNA molecules floating in a liquid.

If that's too bizarre for you, think about simulating a Python VM in
your own head.
If we know one thing about the human brain, it is that thoughts and
concepts are not stored in single, well-defined locations, so when you
think of "x=23", there is no pointer to a location in your head.

>> That's why we should try to keep the different layers of explanation
>> separate, without conflating them. Python programmers don't actually
>> flip bits, and neither do they manipulate references. Python
>> programmers don't have access to bits, or references. What they have
>> access to is objects.
>
> No, that's my point: Python programmers /don't/ have direct access to
> objects. The objects themselves are kept at arm's length by the
> indirection layer of references.

I think you are wrong. If I want a name 'x' to refer to (be bound to)
the object 23, I write x=23, not some variation of:

    create object 23
    give me a reference to that object
    bind the reference to name 'x'

Those three steps may take place at some level of the Python VM, but
that's not what *I* do as a Python programmer.

Note that what we're really doing is manipulating the symbol '23' in
source code. Normally that makes no difference, but if you've ever tried
to get the float 1.1 you'll discover the model (metaphor) of "source
code symbols are programming entities" fails. All models fail sometimes.

>> > Python does pass-by-value, but the things it passes -- by value --
>> > are references.
>>
>> If you're going to misuse pass-by-value to describe what Python does,
>> *everything* is pass-by-value "where the value is foo", for some foo.
>
> No. I've tried explaining this before, with apparently little success.

[...]

>   * A /value/ is an item of data. The range and nature of values is
>     language specific. Typically, values encompass at least some kinds
>     of numbers, textual data, and compound data structures; they may
>     also include behavioural items such as functions.

Yes. This is an intuitive meaning of the word value. In Python, all
values are objects.
Some typical examples of values are:

    5, None, "Fred", True, 3.5, [2, 3, 4], {}, lambda x: x+1

These (and more complicated structures built on top of them) are the
things of interest to the programmer. They are the values: the things
which are denoted by the symbols '5', 'None', '"Fred"' etc.

>   * A /location/[1] is an area of memory suitable for storing the
>     /immediate representation/ (which I shall abbreviate to /IR/) of a
>     value. (A location may be capable of storing things other than
>     IRs, e.g., representations of unevaluated expressions in lazily
>     evaluated languages. Locations may vary in size, e.g., in order to
>     be capable of storing different types of IRs.)

At the level of Python code, we have no access to such locations. The
closest we have is the id() function, which uses location in memory as a
unique ID for objects, but this is an accident of the CPython
implementation. Whatever the /immediate representation/ of a value is,
we can't manipulate it directly in Python code.

>   * A /variable/ is a location to which has been /bound/ a name. Given
>     an occurrence of a name in a program's source, there is a language
>     specific rule for determining the variable to which it is bound.

According to this definition, there are no variables in Python, because
Python's data model is that names are an abstract mapping between
symbols and values, not between symbols and locations.

>   * /Evaluation/ is the process of determining a value from an
>     expression. The /value of/ an expression is the result of
>     evaluating the expression. This value is, in general, dependent on
>     the contents of the locations to which names appearing in the
>     expression are bound.

[...]

> The argument passing model `pass-by-value' has a number of distinctive
> properties.
>
>   * The argument expression is fully evaluated before the function is
>     called, yielding an argument value.
>
>   * The corresponding parameter name is bound to a fresh location.
>
>   * The argument value IR is stored in the parameter's location.

This is an underspecified definition. Without a definition of
/immediate representation/, we can't determine what this means. I can
guess that, based on Pascal, Fortran and C, the /immediate
representation/ of a value is whatever data structure represents that
value.

However, I fear that you are going to try to slip in an open-ended
definition, that /immediate representation/ could be *anything* -- for
ints in C, it will be the bytes that represent the int; for C arrays, it
will be a pointer to the bytes that represent the array; for Python, it
will be references to objects; for Algol 60, it will be thunks.

To avoid weakening pass-by-value to mean everything and anything at all,
I'm going to say that the /immediate representation/ is the bytes which
represent the value. (That is, the value itself.)

Given this, we can see that Python is not pass-by-value. As I have shown
in another post, replying to Joe, the location (as exposed by the id()
function in CPython) of the formal parameter is the same as that of the
argument value, not a fresh location with a copy of the value. To save
you looking up my post, here's a simple example:

>>> def function(parrot):
...     return id(parrot)
...
>>> spam = 23
>>> print id(spam), function(spam)
143599192 143599192

> By contrast, the `pass-by-reference' model has other distinguishing
> properties.
>
>   * Whether arbitrary argument expressions are permitted is language
>     dependent; often, only a subset of available expressions -- those
>     that designate locations -- are permitted.

As you said above: "The /value of/ an expression is the result of
evaluating the expression". Given the expression 2+3, the result of that
expression is 5, not the location where 5 is stored. There is no reason
to believe that 5 designates a location, as opposed to designating the
number of peas in a pod or the average length of a piece of string.

For want of a better description, let me re-word the above to say:

  * Whether arbitrary argument expressions are permitted is language
    dependent; often, only a subset of available expressions -- e.g.
    those that evaluate at a named location -- are permitted.

Note that I say they evaluate *at*, not *to*, a fixed location.

A practical example, to ensure we're talking about the same thing. In
Pascal, I can declare a procedure swap(a, b) taking two VAR parameters,
which use call-by-reference semantics. I might do something like this:

    a := 8;  { number of peas in a pod }
    b := 13; { a baker's dozen }
    swap(a, b);

Even though the values of a and b do not designate locations, the
compiler can pass them to the procedure because the named variables a
and b exist *at* particular locations. Contrast this with:

    swap(a, 10+3);

which will fail in Pascal, because the value of the expression 10+3
doesn't correspond to a named location. (Presumably this is a design
choice, because the value of the expression will certainly exist at a
known location, although possibly not known until runtime.)

>     If the argument
>     expression does designate a location, then this location is the
>     /argument location/.

Replace this with "If the argument expression does evaluate at an
allowed location (named location for Pascal), then..." and I will accept
it.

>     If arbitrary expressions are permitted, and
>     the expression does not designate a location, then a fresh location
>     is allocated to be the argument location, the expression evaluated,
>     and the resulting IR stored in the argument location.

Modulo similar changes, accepted.

>   * The corresponding parameter name is bound to the argument location.

According to this definition, Python is call-by-reference. Refer to my
code snippet above. But clearly Python doesn't behave like
call-by-reference in other languages: you can't write a swap()
procedure.
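
A minimal sketch of that last point, in Python 3 syntax. The function
names swap and append_spam are hypothetical, picked only for
illustration. Rebinding the parameters inside a function leaves the
caller's names alone, while mutating an argument is visible to the
caller:

```python
def swap(a, b):
    # Rebinds the local names a and b only; the caller's names are untouched.
    a, b = b, a


def append_spam(items):
    # Mutates the object the caller passed in; no copy was made.
    items.append("spam")


x, y = 8, 13
swap(x, y)
print(x, y)    # 8 13

menu = ["eggs"]
append_spam(menu)
print(menu)    # ['eggs', 'spam']
```

The first result is not what Pascal's VAR parameters would give, and the
second is not what a copied argument would give.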

This is where I quote Barbara Liskov, talking about the language CLU
which has precisely the same calling semantics as Python:

"In particular it is not call by value because mutations of arguments
performed by the called routine will be visible to the caller. And it is
not call by reference because access is not given to the variables of
the caller, but merely to certain objects."

http://coding.derkeiler.com/Archive/Python/comp.lang.python/2008-11/msg01499.html

or http://snipurl.com/9qd0b

> There are other models, including value/return and call-by-name.

And Python's call-by-object (also CLU, Ruby, Java -- although Java
people don't use the term -- and others).

[...]

> I hope that I have convincingly demonstrated that it's possible to
> define `pass-by-value' in a coherent manner, consistent with
> conventional usage, and distinguishing it clearly from
> `pass-by-reference'.

Of course you can define pass-by-value coherently, but not if the
definition of value can be anything you like. Once you start declaring
that a language is "pass-by-value, where the value is a Foo rather than
the actual value", pass-by-value can be used to describe *anything*.
Pass-by-reference becomes pass-by-value where the value is the location
of the value. Pass-by-object is pass-by-value where the value is a
reference to the object (your claim). And so forth.

[...]

>> > I agree with the comment about Pascal, but C is actually pretty
>> > similar to Python here. C only does pass-by-value.
>>
>> Except for arrays.
>
> Even for those. C doesn't pass arrays at all; instead it passes
> (programmer-visible) pointers. See other article.

But you are conflating concepts again. The value of an array is the
array: it's what the programmer asked for when he declared an array. See
your own definition of value above: "A /value/ is an item of data."
Given a symbol x which represents an array, C doesn't pass the value of
x (the array). It passes a pointer (reference) to the value of x.
This is not pass-by-value unless you define value so broadly that it
could mean anything.

Given the Pascal declaration

    procedure foo(x: array[1..1000] of char)

you get pass-by-value semantics: when you pass an array to foo, the
entire array is duplicated. Changes to x are not visible in the caller's
array. C arrays do not behave like this with an equivalent declaration.

Change the declaration to VAR x, which uses pass-by-reference semantics,
and the array is *not* duplicated, and changes to x *are* visible to the
caller. C's default handling of arrays is just like Pascal's
call-by-reference semantics, not like pass-by-value. This is AFAIK
unique in C to arrays.

[...]

>> In other words... C is call-by-value, and (according to you) Python is
>> call-by-value, but they behave differently.
>
> And this is entirely due to the difference in their immediate
> representations of values.

Values are values. Regardless of whether you are using C or Pascal or
Python, the value of 1+1 is 2, not some arbitrary memory location.

I'm going to quote from Fredrik Lundh:

"I'm not aware of any language where a reference to an object, rather
than the *contents* of the object, is seen as the object's actual value.
It's definitely not true for Python, at least."

http://coding.derkeiler.com/Archive/Python/comp.lang.python/2008-11/msg01341.html

or http://snipurl.com/9qdze

The viewpoint that values are references is bizarre and
counter-intuitive, and it leaves us with no simple way of talking about
the value of expressions in the sense that 2 is the value of the
expression 1+1.

-- 
Steven