An oddity in list comparison and element assignment

2006-06-01 Thread michael . f . ellis
The following script puzzles me.  It creates two nested lists that
compare identically.  After identical element assignments, the lists
are different.  In one case, a single element is replaced. In the
other, an entire column is replaced.

---

'''
An oddity in the behavior of lists of lists. Occurs under
Python 2.4.3 (#69, Mar 29 2006, 17:35:34) [MSC v.1310 32 bit (Intel)]
on win32.
Not tested on other platforms or builds.
'''
a =[[[1,2],[1,2]],[[1,2],[1,2]]]
b = [[range(1,3)]*2]*2
assert(a==b)
print "Initially, python reports that the lists are equal"
a[1][1]=[5]
b[1][1]=[5]
try:
  assert(a==b)
except AssertionError:
  print "After identical element assignments, the lists are not equal"
  print "a is now ", a
  print "b is now ", b
-

Here's the output on my system.


Initially, python reports that the lists are equal
After identical element assignments, the lists are not equal
a is now  [[[1, 2], [1, 2]], [[1, 2], [5]]]
b is now  [[[1, 2], [5]], [[1, 2], [5]]]


This seems contrary to one of my fundamental expectations, namely that
objects which compare equally must remain equal after identical
operations.  I think what must be going on is that the 'b' list
contains replicated references instead of copies of [range(1,3)]*2 .
IMO, python's == operator should detect this difference in list
structure since it leads to different behavior under element
assignments.

Mike Ellis

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: An oddity in list comparison and element assignment

2006-06-01 Thread michael . f . ellis
Hi Alex,
With all due respect to your well-deserved standing in the Python
community, I'm not convinced that equality shouldn't imply invariance
under identical operations.

Perhaps the most fundamental notion is mathematics is that the left and
right sides of an equation remain identical after any operation applied
to both sides.  Our experience of the physical world is similar.  If I
make identical modifications to the engines of two identical
automobiles, I expect the difference in performance to be identical.
If my expectation is met, I would assert that either the two vehicles
were not identical to begin with or that my modifications were not
performed identically.

As to containers,  would you say that envelope containing five $100
bills is the same as an envelope containing a single $100 bill and 4
xerox copies of it? If so, I'd like to engage in some envelope
exchanges with you :-)

As I see it, reference copying is a very useful performance and memory
optimization.  But I don't think it should undermine the validity of
assert(a==b) as a predictor of invariance under identical operations.

Cheers,
Mike Ellis


Alex Martelli wrote:
> <[EMAIL PROTECTED]> wrote:
> Wrong; equality does not imply any check on identity.  You can consider
> the definition of "list A equals list B" as:
>
> -- len(A) == len(B), AND,
> -- for each valid index i, A[i] == B[i]
>
> This is an extremely natural definition of "equality" for containers:
> "they have EQUAL items" [[in the same order, for containers for which
> order is relevant]].  Nowhere in this extremely natural definition does
> the IDENTITY of the items come into play.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: An oddity in list comparison and element assignment

2006-06-01 Thread michael . f . ellis
oops! last sentence of 2nd paragraph in previous message should read
"If my expectation is NOT met ..."

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: An oddity in list comparison and element assignment

2006-06-01 Thread michael . f . ellis
Hi Tim,
In your example, a & b are references to the same object.  I agree they
should compare equally.  But please note that a==b is True at every
point in your example, even after the ValueError raised by b.remove(1).
 That's good consistent behavior.

My original example is a little different.  a and b never referred to
the same object.  They were containers created by different expressions
with no object in common.  The problem arises because the overloaded *
operator makes row level copies of the lists in b.  There's nothing
wrong with that, but the fact remains that a and b are different in a
very significant way.

I agree with Alex that checking for this type of inequality is not a
trivial programming exercise.  It requires (at least) a parallel
recursion that counts references with the containers being compared .
At the same time, I know that much harder programming problems have
been solved.  Where I disagree with Alex is in the labeling of the
existing behavior as 'natural'.

Alternatively, it might make sense to disallow == for containers by
raising a TypeError although that would eliminate a largely useful
feature.

Realistically, I know that Python is unlikely to adopt either
alternative.  It would probably break a lot of existing code.   My
point in the original post was to raise what I felt was a useful topic
for discussion and to help others avoid a pitfall that cost me a couple
of hours of head-scratching.

By the way, I've been programming professionally for over 25 years and
have used at least 30 different languages.  During the past few years,
Python has become my language of choice for almost everything because
it helps me deliver more productivity and value to my clients.

Cheers,
Mike


Tim Peters wrote:
> Think about a simpler case:
>
> a = [1]
> b = a
> assert(a == b)
> a.remove(1)
> b.remove(1)
>
> Oops.  The last line dies with an exception, despite that a==b at the
> third statement and that ".remove(1)" is applied to both a and b.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: An oddity in list comparison and element assignment

2006-06-01 Thread michael . f . ellis
Yes. You stated it quite precisely.  I believe l1==l2 should always
return True and l1==l3 should always be False. (unless l3 is reassigned
as l3=l1).  Your idea of a separate operator for 'all elements have
numerically equal values at the moment of comparision' is a good one.
For want of a better name, it could be called DeepCopyEquality(a,b) and
would be equivalent to a byte-by-byte comparison of two distinct
regions in memory created by a deep copies of a and b.

Cheers,
Mike

Jim Segrave wrote:
> In article <[EMAIL PROTECTED]>,
>  <[EMAIL PROTECTED]> wrote:
> >Hi Alex,
> >With all due respect to your well-deserved standing in the Python
> >community, I'm not convinced that equality shouldn't imply invariance
> >under identical operations.
> >
> >Perhaps the most fundamental notion is mathematics is that the left and
> >right sides of an equation remain identical after any operation applied
> >to both sides.  Our experience of the physical world is similar.  If I
> >make identical modifications to the engines of two identical
> >automobiles, I expect the difference in performance to be identical.
> >If my expectation is met, I would assert that either the two vehicles
> >were not identical to begin with or that my modifications were not
> >performed identically.
> >
> >As to containers,  would you say that envelope containing five $100
> >bills is the same as an envelope containing a single $100 bill and 4
> >xerox copies of it? If so, I'd like to engage in some envelope
> >exchanges with you :-)
> >
> >As I see it, reference copying is a very useful performance and memory
> >optimization.  But I don't think it should undermine the validity of
> >assert(a==b) as a predictor of invariance under identical operations.
>
> I think you end up with a totally unwieldy definition of '==' for
> containers, since you have to check for _identical_ aliasing to
> whatever depth it might occur, and, if you insist on equality
> modeling the physical world, two lists can only be equal if:
>
> for each corresponding element in the two lists, either the element is
> a reference to the same underlying object or the corresponding
> elements are references to objects which do not have and never will
> have other references bound to them.
>
> For example:
>
> ra = ['a', 'a']
> rb = ['b', 'b']
>
> l1 = [ra, rb]
> l2 = [ra, rb]
>
> This will be equal by your definition and will preserve equality over
> identical operations on l1 and l2
>
> l3 = [ ['a', 'b'], ['a', 'b']]
>
> This list will be identical, under your definition, so long as we
> don't have anyone doing anything to the references ra or rb. Your
> equality test has to claim that l1 and l3 are not equal, since ra
> could be changed and that's not an operation on l1 or l3
>
> This also leaves out the complexity of analysiing nested structures -
> if you have a tuple, containing tuples which contain lists, then are
> those top level tuples 'equal' if there are aliases in the lists? How
> many levels deep should an equality test go?
>
> Does the more common test, to see if the elements of a sequence are
> identical at the time of comparision need a new operator or hand
> coding, since most of the time programmers aren't interested in future
> equality or not of the structures.
> 
> -- 
> Jim Segrave   ([EMAIL PROTECTED])

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: An oddity in list comparison and element assignment

2006-06-01 Thread michael . f . ellis
Considering the number of new programmers who get bit by automatic
coercion, I wish Dennis Ritchie had made some different choices when he
designed C.  But then I doubt he ever dreamed it would become so wildly
successful.

Being a curmudgeon purist I'd actually prefer it if Python raised a
TypeError on float vs integer comparisons.

Cheers,
Mike

Kent Johnson wrote:
> [EMAIL PROTECTED] wrote:
> > Hi Alex,
> > With all due respect to your well-deserved standing in the Python
> > community, I'm not convinced that equality shouldn't imply invariance
> > under identical operations.
> >
> > Perhaps the most fundamental notion is mathematics is that the left and
> > right sides of an equation remain identical after any operation applied
> > to both sides.  Our experience of the physical world is similar.  If I
> > make identical modifications to the engines of two identical
> > automobiles, I expect the difference in performance to be identical.
> > If my expectation is met, I would assert that either the two vehicles
> > were not identical to begin with or that my modifications were not
> > performed identically.
>
> But programming is not mathematics and assignment is not an equation.
> How about this:
>
> In [1]: a=3.0
>
> In [2]: b=3
>
> In [3]: a==b
> Out[3]: True
> 
> In [4]: a/2 == b/2
> Out[4]: False
> 
> Kent

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: An oddity in list comparison and element assignment

2006-06-01 Thread michael . f . ellis
Yes (unless I was testing the assertion that the second envelope did
not contain counterfeits of the first)

Scott David Daniels wrote:
> Would you say that envelope containing five $100 bills is equal to
> an envelope containing five $100 bills with different serial numbers?

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: An oddity in list comparison and element assignment

2006-06-01 Thread michael . f . ellis
Truthfully, I wouldn't mind it at all.  In Python, I frequently write
things like
   i == int(f)
or vice versa just to avoid subtle bugs that sometimes creep in when
later modifications to code change the original assumptions.

When working in C, I always set the compiler for maximum warnings and
do my damndest to make them all go away. In the long run, time spent on
rigorous coding always repays itself with interest in time saved
debugging.

Mike

Erik Max Francis wrote:
> [EMAIL PROTECTED] wrote:
>
> > With all due respect to your well-deserved standing in the Python
> > community, I'm not convinced that equality shouldn't imply invariance
> > under identical operations.
>
> Doo you really want
>
>   2 == 2.0
>
> to be False?
>
> --
> Erik Max Francis && [EMAIL PROTECTED] && http://www.alcyone.com/max/
> San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis
>Without victory there is no survival.
>-- Winston Churchill

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: An oddity in list comparison and element assignment

2006-06-01 Thread michael . f . ellis
I believe that 'is' tests equality of reference, such that

>>> a = range(1,3)
>>> b = range(1,3)
>>> a is b
False

The 'is' operator tells you whether a and b refer to the same object.
What I've been discussing is whether == should test for "structural"
equality so that a and b remain equivalent under parallel mutations
(and also under single mutations to common references)

Cheers,
Mike

Maric Michaud wrote:
> Le Jeudi 01 Juin 2006 18:00, [EMAIL PROTECTED] a écrit :
> > Perhaps the most fundamental notion is mathematics is that the left and
> > right sides of an equation remain identical after any operation applied
> > to both sides.
>
> IMHO, you are not aware that the '=' symbol of mathematics exists in python,
> it's the 'is' assertion.
>
> a is b
> and then, do what you want with a (or b), a is b remains True.
>
> THIS is the meaning of expr1 = expr2, but in computer science, this is not as
> important as it is in pure logic (most languages do not even provide the 'is'
> assertion).
>
> --
> _
>
> Maric Michaud
> _
>
> Aristote - www.aristote.info
> 3 place des tapis
> 69004 Lyon
> Tel: +33 426 880 097

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: An oddity in list comparison and element assignment

2006-06-02 Thread michael . f . ellis
Perhaps a little background to my original post will defuse some of the
controversy.  While working during an airline flight, I ran into an
unexpected outcome from using the * replication operator to initialize
an array of lists.  When I modified a single element of the array an
entire column changed.  Having no reference books or internet access
available, I tried to understand what was going on by creating some
small arrays on the command line to see if there was a difference
between explicit initialization and initialization with range() and the
* operator.

The arrays looked identical when printed and a == b returned True. Yet
the arrays were clearly not equivalent because mutating the
corresponding elements produced different outcomes.  I put the problem
aside until the next day when I looked at it some more and and created
the example script I posted.  Just as I was about to hit the Send
button, I realized that the * operator must have been creating
references instead of copies.  And then I appended the now much debated
opinion that == should have detected the difference.

(As an aside, may I point out that Python In A Nutshell states on page
46 "The result of S*n or n*S is the concatenation of n copies of S". It
might be well to put a warning in a future edition that this is not
strictly the case.)

My viewpoint is that of a working professional software consultant.
I'm essentially a pragmatist with no strong 'religious' views about
languages and methodologies.  As I noted in an earlier reply, I don't
realistically expect Python to change the behavior of the == operator.
I do think that a problem arose when it was adopted from C and extended
to allow comparison of containers.  In C, you can use it to compare
integers, floats, and pointers and everyone understands that p==q does
not imply *p == *q.  Moreover, compilers issue warnings about
comparisons between different types.

Basically, I'm looking for simple diagnostic tools that make it easy to
understand what's really going on when code produces an unexpected
result.  A 'strengthened equivalence' operator, to use your terminology
would have been useful to me.

As to constructing pseudocode for such an operator, I've appended a
working script below.  The counterexamples and questions from Slawomir,
Maric, and Jim were really useful in sharpening my thinking about the
matter.  I'm sure there are many ways to break it.  For example, tuples
have no index method, so one would have to be written. Still, I hope it
will serve to move the discussion beyond terms like 'crazy' and
'handwaving' and 'ill-founded'.  I haven't used such perjoratives in
any of my posts and would appreciate the same courtesy.

Cheers,
Mike

'''
StrongEquality -- a first cut at the definition proposed by M. Ellis.
Author: Michael F. Ellis, Ellis & Grant, Inc.
'''

def indices(item,seq):
   '''Utility function that returns a list of indices where item occurs
in seq'''
   result=[]
   for i in xrange(len(seq)):
   try:
  result.append(i+seq[i:].index(item))
   except ValueError:
  return result

def StrongEquality(a,b):
 '''True if a and b are numerically and "structurally" equal'''
 if a is b: return True
 if a != b: return False
 ## At this point we know a and b have the same length and
 ## evaluate numerically equivalent.  We now need to figure out
 ## whether there are any references to identical objects in
non-corresponding
 ## positions of a & b (per Slawomir's example). We also need to
inspect
 ## a and b for non-matching patterns of identical references (per
my example)
 ida=[] ; idb=[]
 for i in xrange(len(a)):
 if a[i] is b[i]:
 continue
 if isinstance(a[i], (int, float, str)) and isinstance(b[i],
(int, float, str)):
 continue## we already know they're numerically
equal

 ida.append(id(a[i]))
 idb.append(id(b[i]))
 ## We know that ida[n] is not idb[n] for all n because we
omitted all
 ## cases where a is b.  Therefore Slawomir's example is
detected if
 ## any id appears in both lists.
 for n in ida:
 if n in idb: return False
 ## Next we test for my example.  I'm sure this can be coded
more
 ## more elegantly ...
 for j in xrange(len(ida)):
  if indices(ida[j],ida) != indices(idb[j],idb): return
False
 ## Lastly, recurse ...
 if not StrongEquality(a[i],b[i]): return False

 return True

if __name__=='__main__':
   ## Rudimentary test cases
   assert StrongEquality(1,1)
   assert not StrongEquality(0,1)

   ## Slawomir's example
   x, y, z = [1],[1],[1]
   a, b = [x,y], [y,z]
   c, d = [[1],[1]], [[1],[1]]
   asser

Re: An oddity in list comparison and element assignment

2006-06-03 Thread michael . f . ellis
Hey Alex, lighten up! Python is a programming language -- not your
family, religion, or civil rights.
Cheers,
Mike

Alex Martelli wrote:
> <[EMAIL PROTECTED]> wrote:
>...
> > (As an aside, may I point out that Python In A Nutshell states on page
> > 46 "The result of S*n or n*S is the concatenation of n copies of S". It
> > might be well to put a warning in a future edition that this is not
> > strictly the case.)
>
> Can you give me an example where, say, for a sequence S,
>
> x = S * 3
>
> is not structurally the same as
>
> x = copy.copy(S) + copy.copy(S) + copy.copy(S)
>
> ...?  That is, where the "* 3" on a sequence is NOT the concatenation of
> three copies (ordinary copies, of course!) of that sequence?  I don't
> think you can... and I can't repeatedly explain or point to the
> distinction between normal, ordinary, shallow copies on one side, and
> "deep copies" on the other, every single time in which that distinction
> MIGHT be relevant (because some reader might not be aware of it); such
> endless repetition would bloat the Nutshell totally away from its role
> as a CONCISE desktop reference, and seriously hamper its usefulness
> (particularly by DESTROYING any trace of usefulness for anybody who's
> finally *GOT* this crucial bit, but not just in that way).
>
>
> > languages and methodologies.  As I noted in an earlier reply, I don't
> > realistically expect Python to change the behavior of the == operator.
>
> Then you might have avoided trying to convince anybody, or even trying
> to IMPLY, that in an ideal version of Python == *SHOULD* behave your way
> -- Python's semantics *ARE* entirely up for rediscussion at the moment,
> with an eye on the future "Python 3000" release, so this is one of the
> very rare periods of the history of the language where backwards
> incompatibility of a potential change is _NOT_ a blocking point.
>
> By asserting that your version of == would be "more natural", and trying
> to defend that assertion by vague handwaving references to maths and
> "real world", you managed to entirely shift MY mindstate (and possibly
> that of several other discussants) into one of total and absolute
> opposition to the proposal -- having by now spent considerable time and
> energy pondering and debating the issue, I am now entirely convinced
> that a language with such an == operator instead of Python's current one
> would be such a total, unadulterated disaster that I would refuse to use
> that language, no matter what other "good" features it might present to
> me.  I've walked away from great jobs, just because they would have
> required me to use some technology I just could not stand, more than
> once already in my life: and this IS how strongly (and negatively) I
> feel about your claim that, for built-in ==, your semantics would be in
> any way preferable to Python's.
>
> By managing to thus focus my mindset (and make me spend my energy and
> time) in opposition to your claims of "more natural", you have at least
> managed to ensure that I will not now lend any scrap of help or support
> to your debugging needs.  If you were as pragmatic as you claim to be,
> this kind of consideration WOULD have weighed highly in your choices.
>
> I.ie., if you had WANTED to attract any such support and help, a
> completely different attitude than that "most natural" claim would have
> been ENORMOUSLY more productive -- and your continuing attempts to
> debate that issue aren't helping AT ALL either:
>
> > I do think that a problem arose when it was adopted from C and extended
> > to allow comparison of containers.  In C, you can use it to compare
> > integers, floats, and pointers and everyone understands that p==q does
> > not imply *p == *q.
>
> If that is so, then everyone is utterly, totally, entirely, horribly
> *WRONG*, because, in C, p==q ***DOES*** imply *p == *q (whenever p --
> and by consequence q, given equality -- may legitimately be
> dereferenced: p == q == 0 does not imply anything about *p and/or *q,
> which may produce random results, crash the process, or whatever -- of
> course).
>
> You no doubt meant to say something entirely different from what you
> ACTUALLY said, but I respectfully suggest you spare your breath rather
> than keep trying to defend an indefensible position.
>
> I do NOT agree, and I cannot imagine any state of the world that would
> get me to agree, with your claim that "a problem arose" by allowing
> equality comparison of containers in Python (many other languages allow
> such comparisons, BTW; I would consider it a horrible wart if a language
> claiming to be "higher level" did NOT).  That you're "supporting" (HA!)
> your absurd claim with an equally absurd (and obviously, wholly false,
> unfounded, and misplaced) claim about C pointers doesn't "help", of
> course, but even perfectly accurate claims about C (or machine code, or
> Cobol, or RPG...) would be pretty much uninteresting and irrelevant.
>
> > Moreover, compilers issue warnings abou