Gabriel Genellina wrote:
> On Thu, 03 Apr 2008 19:27:47 -0300, <[EMAIL PROTECTED]> wrote:
>
>> Hi all,
>>
>> I've been playing around with the identity function id() for different
>> types of objects, and I think I understand its behaviour when it comes
>> to objects like lists and tuples in which case an assignment r2 = r1
>> (r1 refers to an existing object) creates an alias r2 that refers to
>> the same object as r1. In this case id(r1) == id(r2) (or, if you
>> like: r1 is r2). However for r1, r2 assigned as follows: r1 = [1, 2,
>> 3] and r2 = [1, 2, 3], (r1 is r2) is False, even if r1==r2,
>> etc. ...this is all very well. Therefore, it seems that id(r) can be
>> interpreted as the address of the object that 'r' refers to.
>>
>> My observations of its behaviour when comparing ints, floats and
>> strings have raised some questions in my mind, though. Consider the
>> following examples:
>>
>> #########################################################################
>>
>> # (1) turns out to be true
>> a = 10
>> b = 10
>> print a is b
>
> ...only because CPython happens to cache small integers and always return
> the same object. Try again with 10000. This is just an optimization and
> the actual range of cached integers, or whether they are cached at all, is
> implementation (and version) dependent.
> (As integers are immutable, the optimization *can* be done, but that
> doesn't mean that all immutable objects are always shared).
>
>> # (2) turns out to be false
>> f = 10.0
>> g = 10.0
>> print f is g
>
> Because the above optimization isn't used for floats.
> The `is` operator checks object identity: whether both operands are the
> very same object (*not* a copy, or being equal: the *same* object)
> ("identity" is a primitive concept)
> The only way to guarantee that you are talking about the same object is
> to use a reference to a previously created object.
> That is:
>
> a = some_arbitrary_object
> b = a
> assert a is b
>
> The name `b` now refers to the same object as name `a`; the assertion
> holds for whatever object it is.
>
> In other cases, like (1) and (2) above, the literals are just handy
> constructors for int and float objects. You have two objects constructed
> (a and b, f and g). Whether they are identical or not is not defined; they
> might be the same, or not, depending on unknown factors that might include
> the moon phase; both alternatives are valid Python.
>
>> # (3) checking if ids of all list elements are the same for different cases:
>>
>> a = 3*[1]; areAllElementsEqual([id(i) for i in a]) # True
>> b = [1, 1, 1]; areAllElementsEqual([id(i) for i in b]) # True
>> f = 3*[1.0]; areAllElementsEqual([id(i) for i in f]) # True
>> g = [1.0, 1.0, 1.0]; areAllElementsEqual([id(i) for i in g]) # True
>> g1 = [1.0, 1.0, 0.5+0.5]; areAllElementsEqual([id(i) for i in g1]) # False
>
> Again, this is implementation dependent. If you try with a different
> Python version or a different implementation you may get other results -
> and that doesn't mean that any of them is broken.
>
>> # (4) two equal floats defined inside a function body behave differently than case (1):
>>
>> def func():
>>     f = 10.0
>>     g = 10.0
>>     return f is g
>>
>> print func() # True
>
> Another implementation detail related to co_consts. You shouldn't rely on
> it.
>
>> I didn't mention any examples with strings; they behaved like ints
>> with respect to their id properties for all the cases I tried.
>
> You didn't try hard enough :)
>
> py> x = "abc"
> py> y = ''.join(x)
> py> x == y
> True
> py> x is y
> False
>
> Long strings behave like big integers: they aren't cached:
>
> py> x = "a rather long string, full of garbage. No, this isn't garbage, just non sense text to fill space."
> py> y = "a rather long string, full of garbage. No, this isn't garbage, just non sense text to fill space."
> py> x == y
> True
> py> x is y
> False
>
> As always: you have two statements constructing two objects. Whether they
> return the same object or not is not defined.
>
>> While I have no particular qualms about the behaviour, I have the
>> following questions:
>>
>> 1) Which of the above behaviours are reliable? For example, does a1 =
>> a2 for ints and strings always imply that a1 is a2?
>
> If you mean:
>
> a1 = something
> a2 = a1
> a1 is a2
>
> then, from my comments above, you should be able to answer: yes, always,
> not restricted to ints and strings.
>
> If you mean:
>
> a1 = someliteral
> a2 = someliteral
> a1 is a2
>
> then: no, it isn't guaranteed at all, not even for small integers or
> strings.
>
>> 2) From the programmer's perspective, are ids of ints, floats and
>> string of any practical significance at all (since these types are
>> immutable)?
>
> The same significance as id() of any other object... mostly, none, except
> for debugging purposes.
>
>> 3) Does the behaviour of ids for lists and tuples of the same element
>> (of type int, string and sometimes even float), imply that the tuple a
>> = (1,) takes (nearly) the same storage space as a = 10000*(1,)? (What
>> about a list, where elements can be changed at will?)
>
> That's a different thing. A tuple maintains only references to its
> elements (as any other object in Python). The memory required for a tuple
> (I'm talking of CPython exclusively) is: (a small header) + n *
> sizeof(pointer). So the expression 10000*(anything,) will take more space
> than the singleton (anything,) because the former requires space for 10000
> pointers and the latter just one.
>
> You have to take into account the memory for the elements themselves; but
> in both cases there is a *single* object referenced, so it doesn't matter.
> Note that it doesn't matter whether that single element is an integer, a
> string, mutable or immutable object: it's always the same object, already
> existing, and creating that 10000-tuple just increments its reference count
> by 10000.
>
> The situation is similar for lists, except that being mutable containers,
> they're over-allocated (to have room for future expansion). So the list
> [anything]*10000 has a size somewhat larger than 10000*sizeof(pointer);
> its (only) element increments its reference count by 10000.
>
In fact all you can in truth say is that
    a is b --> a == b

The converse is definitely not true.

regards
 Steve
--
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC              http://www.holdenweb.com/
--
http://mail.python.org/mailman/listinfo/python-list
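
A minimal sketch of the points discussed in this thread, written in Python 3 syntax (the thread itself uses Python 2 print statements). The function names identity_demo and shared_element_demo are hypothetical, chosen only for this illustration, and the size and reference-count figures assume CPython. The first function shows the one identity result the language guarantees (an alias is the same object) next to a separately built equal list, and also that "a is b --> a == b" has at least one exception: a NaN float is identical to itself but compares unequal to itself. The second function shows that n*(x,) holds n references to a single object.

```python
import sys


def identity_demo():
    """Collect identity and equality results that do not depend on caching."""
    original = [1, 2, 3]
    alias = original        # second name bound to the same list object
    rebuilt = [1, 2, 3]     # a separate list display: equal, but a new object
    nan = float("nan")      # identical to itself, yet not equal to itself
    return {
        "alias_is_original": alias is original,
        "rebuilt_equals_original": rebuilt == original,
        "rebuilt_is_original": rebuilt is original,
        "nan_is_nan": nan is nan,
        "nan_equals_nan": nan == nan,
    }


def shared_element_demo(n=10000):
    """Compare a 1-tuple with an n-tuple that repeats the same element."""
    element = object()
    one = (element,)
    refs_before = sys.getrefcount(element)
    many = n * (element,)   # copies n references, not n objects
    refs_after = sys.getrefcount(element)
    return {
        "all_slots_same_object": all(item is element for item in many),
        # CPython-specific: container sizes in bytes, excluding the element.
        "size_one": sys.getsizeof(one),
        "size_many": sys.getsizeof(many),
        # CPython-specific: how many new references the n-tuple added.
        "refcount_growth": refs_after - refs_before,
    }


if __name__ == "__main__":
    for name, value in identity_demo().items():
        print(name, value)
    for name, value in shared_element_demo().items():
        print(name, value)
```

Whether two equal int, float or str literals end up as the same object is deliberately left out of the sketch, since that result depends on the implementation and version.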