An oddity in list comparison and element assignment
The following script puzzles me. It creates two nested lists that compare identically. After identical element assignments, the lists are different. In one case, a single element is replaced. In the other, an entire column is replaced. --- ''' An oddity in the behavior of lists of lists. Occurs under Python 2.4.3 (#69, Mar 29 2006, 17:35:34) [MSC v.1310 32 bit (Intel)] on win32. Not tested on other platforms or builds. ''' a =[[[1,2],[1,2]],[[1,2],[1,2]]] b = [[range(1,3)]*2]*2 assert(a==b) print "Initially, python reports that the lists are equal" a[1][1]=[5] b[1][1]=[5] try: assert(a==b) except AssertionError: print "After identical element assignments, the lists are not equal" print "a is now ", a print "b is now ", b - Here's the output on my system. Initially, python reports that the lists are equal After identical element assignments, the lists are not equal a is now [[[1, 2], [1, 2]], [[1, 2], [5]]] b is now [[[1, 2], [5]], [[1, 2], [5]]] This seems contrary to one of my fundamental expectations, namely that objects which compare equally must remain equal after identical operations. I think what must be going on is that the 'b' list contains replicated references instead of copies of [range(1,3)]*2 . IMO, python's == operator should detect this difference in list structure since it leads to different behavior under element assignments. Mike Ellis -- http://mail.python.org/mailman/listinfo/python-list
Re: An oddity in list comparison and element assignment
Hi Alex, With all due respect to your well-deserved standing in the Python community, I'm not convinced that equality shouldn't imply invariance under identical operations. Perhaps the most fundamental notion is mathematics is that the left and right sides of an equation remain identical after any operation applied to both sides. Our experience of the physical world is similar. If I make identical modifications to the engines of two identical automobiles, I expect the difference in performance to be identical. If my expectation is met, I would assert that either the two vehicles were not identical to begin with or that my modifications were not performed identically. As to containers, would you say that envelope containing five $100 bills is the same as an envelope containing a single $100 bill and 4 xerox copies of it? If so, I'd like to engage in some envelope exchanges with you :-) As I see it, reference copying is a very useful performance and memory optimization. But I don't think it should undermine the validity of assert(a==b) as a predictor of invariance under identical operations. Cheers, Mike Ellis Alex Martelli wrote: > <[EMAIL PROTECTED]> wrote: > Wrong; equality does not imply any check on identity. You can consider > the definition of "list A equals list B" as: > > -- len(A) == len(B), AND, > -- for each valid index i, A[i] == B[i] > > This is an extremely natural definition of "equality" for containers: > "they have EQUAL items" [[in the same order, for containers for which > order is relevant]]. Nowhere in this extremely natural definition does > the IDENTITY of the items come into play. -- http://mail.python.org/mailman/listinfo/python-list
Re: An oddity in list comparison and element assignment
oops! last sentence of 2nd paragraph in previous message should read "If my expectation is NOT met ..." -- http://mail.python.org/mailman/listinfo/python-list
Re: An oddity in list comparison and element assignment
Hi Tim, In your example, a & b are references to the same object. I agree they should compare equally. But please note that a==b is True at every point in your example, even after the ValueError raised by b.remove(1). That's good consistent behavior. My original example is a little different. a and b never referred to the same object. They were containers created by different expressions with no object in common. The problem arises because the overloaded * operator makes row level copies of the lists in b. There's nothing wrong with that, but the fact remains that a and b are different in a very significant way. I agree with Alex that checking for this type of inequality is not a trivial programming exercise. It requires (at least) a parallel recursion that counts references with the containers being compared . At the same time, I know that much harder programming problems have been solved. Where I disagree with Alex is in the labeling of the existing behavior as 'natural'. Alternatively, it might make sense to disallow == for containers by raising a TypeError although that would eliminate a largely useful feature. Realistically, I know that Python is unlikely to adopt either alternative. It would probably break a lot of existing code. My point in the original post was to raise what I felt was a useful topic for discussion and to help others avoid a pitfall that cost me a couple of hours of head-scratching. By the way, I've been programming professionally for over 25 years and have used at least 30 different languages. During the past few years, Python has become my language of choice for almost everything because it helps me deliver more productivity and value to my clients. Cheers, Mike Tim Peters wrote: > Think about a simpler case: > > a = [1] > b = a > assert(a == b) > a.remove(1) > b.remove(1) > > Oops. The last line dies with an exception, despite that a==b at the > third statement and that ".remove(1)" is applied to both a and b. -- http://mail.python.org/mailman/listinfo/python-list
Re: An oddity in list comparison and element assignment
Yes. You stated it quite precisely. I believe l1==l2 should always return True and l1==l3 should always be False. (unless l3 is reassigned as l3=l1). Your idea of a separate operator for 'all elements have numerically equal values at the moment of comparision' is a good one. For want of a better name, it could be called DeepCopyEquality(a,b) and would be equivalent to a byte-by-byte comparison of two distinct regions in memory created by a deep copies of a and b. Cheers, Mike Jim Segrave wrote: > In article <[EMAIL PROTECTED]>, > <[EMAIL PROTECTED]> wrote: > >Hi Alex, > >With all due respect to your well-deserved standing in the Python > >community, I'm not convinced that equality shouldn't imply invariance > >under identical operations. > > > >Perhaps the most fundamental notion is mathematics is that the left and > >right sides of an equation remain identical after any operation applied > >to both sides. Our experience of the physical world is similar. If I > >make identical modifications to the engines of two identical > >automobiles, I expect the difference in performance to be identical. > >If my expectation is met, I would assert that either the two vehicles > >were not identical to begin with or that my modifications were not > >performed identically. > > > >As to containers, would you say that envelope containing five $100 > >bills is the same as an envelope containing a single $100 bill and 4 > >xerox copies of it? If so, I'd like to engage in some envelope > >exchanges with you :-) > > > >As I see it, reference copying is a very useful performance and memory > >optimization. But I don't think it should undermine the validity of > >assert(a==b) as a predictor of invariance under identical operations. > > I think you end up with a totally unwieldy definition of '==' for > containers, since you have to check for _identical_ aliasing to > whatever depth it might occur, and, if you insist on equality > modeling the physical world, two lists can only be equal if: > > for each corresponding element in the two lists, either the element is > a reference to the same underlying object or the corresponding > elements are references to objects which do not have and never will > have other references bound to them. > > For example: > > ra = ['a', 'a'] > rb = ['b', 'b'] > > l1 = [ra, rb] > l2 = [ra, rb] > > This will be equal by your definition and will preserve equality over > identical operations on l1 and l2 > > l3 = [ ['a', 'b'], ['a', 'b']] > > This list will be identical, under your definition, so long as we > don't have anyone doing anything to the references ra or rb. Your > equality test has to claim that l1 and l3 are not equal, since ra > could be changed and that's not an operation on l1 or l3 > > This also leaves out the complexity of analysiing nested structures - > if you have a tuple, containing tuples which contain lists, then are > those top level tuples 'equal' if there are aliases in the lists? How > many levels deep should an equality test go? > > Does the more common test, to see if the elements of a sequence are > identical at the time of comparision need a new operator or hand > coding, since most of the time programmers aren't interested in future > equality or not of the structures. > > -- > Jim Segrave ([EMAIL PROTECTED]) -- http://mail.python.org/mailman/listinfo/python-list
Re: An oddity in list comparison and element assignment
Considering the number of new programmers who get bit by automatic coercion, I wish Dennis Ritchie had made some different choices when he designed C. But then I doubt he ever dreamed it would become so wildly successful. Being a curmudgeon purist I'd actually prefer it if Python raised a TypeError on float vs integer comparisons. Cheers, Mike Kent Johnson wrote: > [EMAIL PROTECTED] wrote: > > Hi Alex, > > With all due respect to your well-deserved standing in the Python > > community, I'm not convinced that equality shouldn't imply invariance > > under identical operations. > > > > Perhaps the most fundamental notion is mathematics is that the left and > > right sides of an equation remain identical after any operation applied > > to both sides. Our experience of the physical world is similar. If I > > make identical modifications to the engines of two identical > > automobiles, I expect the difference in performance to be identical. > > If my expectation is met, I would assert that either the two vehicles > > were not identical to begin with or that my modifications were not > > performed identically. > > But programming is not mathematics and assignment is not an equation. > How about this: > > In [1]: a=3.0 > > In [2]: b=3 > > In [3]: a==b > Out[3]: True > > In [4]: a/2 == b/2 > Out[4]: False > > Kent -- http://mail.python.org/mailman/listinfo/python-list
Re: An oddity in list comparison and element assignment
Yes (unless I was testing the assertion that the second envelope did not contain counterfeits of the first) Scott David Daniels wrote: > Would you say that envelope containing five $100 bills is equal to > an envelope containing five $100 bills with different serial numbers? -- http://mail.python.org/mailman/listinfo/python-list
Re: An oddity in list comparison and element assignment
Truthfully, I wouldn't mind it at all. In Python, I frequently write things like i == int(f) or vice versa just to avoid subtle bugs that sometimes creep in when later modifications to code change the original assumptions. When working in C, I always set the compiler for maximum warnings and do my damndest to make them all go away. In the long run, time spent on rigorous coding always repays itself with interest in time saved debugging. Mike Erik Max Francis wrote: > [EMAIL PROTECTED] wrote: > > > With all due respect to your well-deserved standing in the Python > > community, I'm not convinced that equality shouldn't imply invariance > > under identical operations. > > Doo you really want > > 2 == 2.0 > > to be False? > > -- > Erik Max Francis && [EMAIL PROTECTED] && http://www.alcyone.com/max/ > San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis >Without victory there is no survival. >-- Winston Churchill -- http://mail.python.org/mailman/listinfo/python-list
Re: An oddity in list comparison and element assignment
I believe that 'is' tests equality of reference, such that >>> a = range(1,3) >>> b = range(1,3) >>> a is b False The 'is' operator tells you whether a and b refer to the same object. What I've been discussing is whether == should test for "structural" equality so that a and b remain equivalent under parallel mutations (and also under single mutations to common references) Cheers, Mike Maric Michaud wrote: > Le Jeudi 01 Juin 2006 18:00, [EMAIL PROTECTED] a écrit : > > Perhaps the most fundamental notion is mathematics is that the left and > > right sides of an equation remain identical after any operation applied > > to both sides. > > IMHO, you are not aware that the '=' symbol of mathematics exists in python, > it's the 'is' assertion. > > a is b > and then, do what you want with a (or b), a is b remains True. > > THIS is the meaning of expr1 = expr2, but in computer science, this is not as > important as it is in pure logic (most languages do not even provide the 'is' > assertion). > > -- > _ > > Maric Michaud > _ > > Aristote - www.aristote.info > 3 place des tapis > 69004 Lyon > Tel: +33 426 880 097 -- http://mail.python.org/mailman/listinfo/python-list
Re: An oddity in list comparison and element assignment
Perhaps a little background to my original post will defuse some of the controversy. While working during an airline flight, I ran into an unexpected outcome from using the * replication operator to initialize an array of lists. When I modified a single element of the array an entire column changed. Having no reference books or internet access available, I tried to understand what was going on by creating some small arrays on the command line to see if there was a difference between explicit initialization and initialization with range() and the * operator. The arrays looked identical when printed and a == b returned True. Yet the arrays were clearly not equivalent because mutating the corresponding elements produced different outcomes. I put the problem aside until the next day when I looked at it some more and and created the example script I posted. Just as I was about to hit the Send button, I realized that the * operator must have been creating references instead of copies. And then I appended the now much debated opinion that == should have detected the difference. (As an aside, may I point out that Python In A Nutshell states on page 46 "The result of S*n or n*S is the concatenation of n copies of S". It might be well to put a warning in a future edition that this is not strictly the case.) My viewpoint is that of a working professional software consultant. I'm essentially a pragmatist with no strong 'religious' views about languages and methodologies. As I noted in an earlier reply, I don't realistically expect Python to change the behavior of the == operator. I do think that a problem arose when it was adopted from C and extended to allow comparison of containers. In C, you can use it to compare integers, floats, and pointers and everyone understands that p==q does not imply *p == *q. Moreover, compilers issue warnings about comparisons between different types. Basically, I'm looking for simple diagnostic tools that make it easy to understand what's really going on when code produces an unexpected result. A 'strengthened equivalence' operator, to use your terminology would have been useful to me. As to constructing pseudocode for such an operator, I've appended a working script below. The counterexamples and questions from Slawomir, Maric, and Jim were really useful in sharpening my thinking about the matter. I'm sure there are many ways to break it. For example, tuples have no index method, so one would have to be written. Still, I hope it will serve to move the discussion beyond terms like 'crazy' and 'handwaving' and 'ill-founded'. I haven't used such perjoratives in any of my posts and would appreciate the same courtesy. Cheers, Mike ''' StrongEquality -- a first cut at the definition proposed by M. Ellis. Author: Michael F. Ellis, Ellis & Grant, Inc. ''' def indices(item,seq): '''Utility function that returns a list of indices where item occurs in seq''' result=[] for i in xrange(len(seq)): try: result.append(i+seq[i:].index(item)) except ValueError: return result def StrongEquality(a,b): '''True if a and b are numerically and "structurally" equal''' if a is b: return True if a != b: return False ## At this point we know a and b have the same length and ## evaluate numerically equivalent. We now need to figure out ## whether there are any references to identical objects in non-corresponding ## positions of a & b (per Slawomir's example). We also need to inspect ## a and b for non-matching patterns of identical references (per my example) ida=[] ; idb=[] for i in xrange(len(a)): if a[i] is b[i]: continue if isinstance(a[i], (int, float, str)) and isinstance(b[i], (int, float, str)): continue## we already know they're numerically equal ida.append(id(a[i])) idb.append(id(b[i])) ## We know that ida[n] is not idb[n] for all n because we omitted all ## cases where a is b. Therefore Slawomir's example is detected if ## any id appears in both lists. for n in ida: if n in idb: return False ## Next we test for my example. I'm sure this can be coded more ## more elegantly ... for j in xrange(len(ida)): if indices(ida[j],ida) != indices(idb[j],idb): return False ## Lastly, recurse ... if not StrongEquality(a[i],b[i]): return False return True if __name__=='__main__': ## Rudimentary test cases assert StrongEquality(1,1) assert not StrongEquality(0,1) ## Slawomir's example x, y, z = [1],[1],[1] a, b = [x,y], [y,z] c, d = [[1],[1]], [[1],[1]] asser
Re: An oddity in list comparison and element assignment
Hey Alex, lighten up! Python is a programming language -- not your family, religion, or civil rights. Cheers, Mike Alex Martelli wrote: > <[EMAIL PROTECTED]> wrote: >... > > (As an aside, may I point out that Python In A Nutshell states on page > > 46 "The result of S*n or n*S is the concatenation of n copies of S". It > > might be well to put a warning in a future edition that this is not > > strictly the case.) > > Can you give me an example where, say, for a sequence S, > > x = S * 3 > > is not structurally the same as > > x = copy.copy(S) + copy.copy(S) + copy.copy(S) > > ...? That is, where the "* 3" on a sequence is NOT the concatenation of > three copies (ordinary copies, of course!) of that sequence? I don't > think you can... and I can't repeatedly explain or point to the > distinction between normal, ordinary, shallow copies on one side, and > "deep copies" on the other, every single time in which that distinction > MIGHT be relevant (because some reader might not be aware of it); such > endless repetition would bloat the Nutshell totally away from its role > as a CONCISE desktop reference, and seriously hamper its usefulness > (particularly by DESTROYING any trace of usefulness for anybody who's > finally *GOT* this crucial bit, but not just in that way). > > > > languages and methodologies. As I noted in an earlier reply, I don't > > realistically expect Python to change the behavior of the == operator. > > Then you might have avoided trying to convince anybody, or even trying > to IMPLY, that in an ideal version of Python == *SHOULD* behave your way > -- Python's semantics *ARE* entirely up for rediscussion at the moment, > with an eye on the future "Python 3000" release, so this is one of the > very rare periods of the history of the language where backwards > incompatibility of a potential change is _NOT_ a blocking point. > > By asserting that your version of == would be "more natural", and trying > to defend that assertion by vague handwaving references to maths and > "real world", you managed to entirely shift MY mindstate (and possibly > that of several other discussants) into one of total and absolute > opposition to the proposal -- having by now spent considerable time and > energy pondering and debating the issue, I am now entirely convinced > that a language with such an == operator instead of Python's current one > would be such a total, unadulterated disaster that I would refuse to use > that language, no matter what other "good" features it might present to > me. I've walked away from great jobs, just because they would have > required me to use some technology I just could not stand, more than > once already in my life: and this IS how strongly (and negatively) I > feel about your claim that, for built-in ==, your semantics would be in > any way preferable to Python's. > > By managing to thus focus my mindset (and make me spend my energy and > time) in opposition to your claims of "more natural", you have at least > managed to ensure that I will not now lend any scrap of help or support > to your debugging needs. If you were as pragmatic as you claim to be, > this kind of consideration WOULD have weighed highly in your choices. > > I.ie., if you had WANTED to attract any such support and help, a > completely different attitude than that "most natural" claim would have > been ENORMOUSLY more productive -- and your continuing attempts to > debate that issue aren't helping AT ALL either: > > > I do think that a problem arose when it was adopted from C and extended > > to allow comparison of containers. In C, you can use it to compare > > integers, floats, and pointers and everyone understands that p==q does > > not imply *p == *q. > > If that is so, then everyone is utterly, totally, entirely, horribly > *WRONG*, because, in C, p==q ***DOES*** imply *p == *q (whenever p -- > and by consequence q, given equality -- may legitimately be > dereferenced: p == q == 0 does not imply anything about *p and/or *q, > which may produce random results, crash the process, or whatever -- of > course). > > You no doubt meant to say something entirely different from what you > ACTUALLY said, but I respectfully suggest you spare your breath rather > than keep trying to defend an indefensible position. > > I do NOT agree, and I cannot imagine any state of the world that would > get me to agree, with your claim that "a problem arose" by allowing > equality comparison of containers in Python (many other languages allow > such comparisons, BTW; I would consider it a horrible wart if a language > claiming to be "higher level" did NOT). That you're "supporting" (HA!) > your absurd claim with an equally absurd (and obviously, wholly false, > unfounded, and misplaced) claim about C pointers doesn't "help", of > course, but even perfectly accurate claims about C (or machine code, or > Cobol, or RPG...) would be pretty much uninteresting and irrelevant. > > > Moreover, compilers issue warnings abou