On Mon, 24 Oct 2005 09:23:58 +0000, Antoon Pardon wrote: > Op 2005-10-21, Christopher Subich schreef <[EMAIL PROTECTED]>: >> Antoon Pardon wrote: >>> It would be better if cmp would give an indication it >>> can't compare two objects instead of giving incorrect >>> and inconsistent results. >> >> If two objects aren't totally comparable, then using 'cmp' on them is >> ill-defined to begin with. > > Where is that documented?
Surely that is so obvious, it doesn't need documenting? If comparisons aren't defined between two things ("which is longer, water or fire?") then comparing them is ill-defined. On second thoughts... no, I don't believe it is so obvious. People are used to thinking about comparisons in the context of numbers and strings, and it won't hurt to remind them that not every pair of objects can be compared. >> The Standard Thing To Do is throw an >> exception; see the Highly Obscure Case of the Complex Numbers. >> >> >>>1 == 1j >> False >> >>>1 != 1j >> True >> >>>1 < 1j >> Traceback (most recent call last): >> File "<stdin>", line 1, in ? >> TypeError: cannot compare complex numbers using <, <=, >, >= > > I would say this is the wrong answer. The right answer should > be False IMO. I'm afraid that mathematically you are wrong. Greater and less than is not defined for complex numbers. I think I see where you are coming from: you are reading "1 < 1j" as "is one less than one*j", the answer to which is "no". That's a sensible answer to a sensible question, except for two problems: (1) you will upset the mathematical purists, who argue that the English sentence is not quite the same as the mathematical relation; and (2) you will upset the programmers, since your suggestion would create a whole class of gotchas and bugs, where people wrongly assume that since 1<1j is False, 1>1j should be True. This is one case where practicality *is* purity. > Especially in the following case do I think the TypeError is the wrong > answer: > >>>> 3 + 0j < 2 + 0j > Traceback (most recent call last): > File "<stdin>", line 1, in ? > TypeError: cannot compare complex numbers using <, <=, >, >= That's an interesting case. A mathematical purist might argue that ordering is not defined for complex numbers, even if the imaginary component is zero. A *different* purist might argue: 3 = 3+0j 2 = 2+0j then since 3 > 2 it should also be true that 3+0j > 2+0j. I can see merit in both arguments, but I'm inclined to go with the second argument, if and only if the imaginary component is exactly zero. I take comfort in the fact that mathematicians allow continued fractions with zero terms even though dividing by zero is verboten. > Look at sets: > >>>> set([1]) <= set([2]) > False >>>> set([2]) <= set([1]) > False No, sorry, you are making a mistake. In the context of sets, the < operator doesn't mean "less than", it means "is a subset of". It is correct to say neither "set([1]) is a subset of set([2])" nor vice versa is true. Both are false. By analogy, one can ask, "is the cat inside the box?" and get the answer "No", but this does not imply that therefore the box must be inside the cat. [snip] >> So your use-case is already well-defined, and rather simple. Make >> __cmp__ raise an exception of your choice, > > That is not well defined. It doesn't allow to distinghuish between > something went wrong in __cmp__ and and answer that indicates the only > usefull thing you can say is that they are unequal. Agreed there: there should be a distinct exception for trying to compare incomparable objects, rather than ValueError or TypeError. > Besides I find the following result inconsistent: > >>>> set([1]) <= set([1,2]) > True >>>> cmp(set([1]), set([1,2])) > Traceback (most recent call last): > File "<stdin>", line 1, in ? > TypeError: cannot compare sets using cmp() > > The first set is smaller than the second so acoording to the > documentation, I should get a negative number as a result. No, the first expression tests whether set([1]) is a subset of set([1,2]), not whether it is "smaller". Smaller and bigger aren't defined for sets. Informally, we can say that one set is smaller than another if it has fewer members, but that's only one possible definition of "smaller" out of many. Think, for example, of the set of all US military forces. That has one member. Compare it to the set of all US presidents whose surname is or was Bush. That set has two members. Do you think it is sensible definition of "smaller" to say that the set of US military forces is smaller than the set of Presidents called Bush? [snip] >> then your >> _ONLY_ choices, no matter your implementation, is to return some >> possibly inconsistent result (a < b == 1, b < c == 1, a < c == 0) or >> raise an exception for unapplicable comparisons. >> >> This isn't a Python problem; this is a problem from formal mathematics. > > I think it is python's problem if python gives the wrong results. > > when a < b then cmp(a,b) should give a negative number as a result. > Python sometimes does not! > > when not a < b then cmp(a,b) should not give a negative number as a > result. Python sometimes does! > > I think those are python problems. I think you need to give examples of where < or > *defined as comparisons* returns the wrong results. The examples from sets do not count, because the operators aren't being used for comparisons. >> Personally, I'd favor the "raise an exception" case, which is exactly >> what will happen if you define rich comparisons and let cmp throw an >> exception. > > But I don't want cmp to raise an exception when the two elements are > comparable. > > That cmp(set([1]), set([2])) would raise an exception, I can understand, > although I would prefer it returned None, but why should cmp(set([1]), > set([1,2])) raise an exception. There is a perfectly well defined answer > for this. Alas, there is not. First you have to define what "less than" and "greater than" means for sets, and there is no well-defined answer to that. They certainly don't mean "is a subset" and "is a superset". Nor is it sensible to define them in terms of the number of elements: def lt(set1, set2): """Defines less than for sets in terms of the number of elements.""" return len(set1) < len(set2) def gt(set1, set2): """Defines greater than for sets in terms of the number of elements.""" return len(set1) > len(set2) py> lt( set([1]), set([1, 2]) ) True py> gt( set([1]), set([1, 2]) ) False So far so good. But now: py> lt( set([1]), set([2]) ) False py> gt( set([1]), set([2]) ) False py> set([1]) == set([2]) False So set([1]) is neither less than, nor greater than, nor equal to set([2]). >> Operators that assume comparable objects, like sort, also almost always >> assume a total order -- inconsistent operations can give weird results, >> while throwing an exception will at least (usually) give some sort of >> error that can be caught. > > No it wont. the sort method will happily sort a list of sets. > >>>> lst > [set([1]), set([2])] >>>> lst.sort() That's an accident of implementation. Sorting nows uses only <, rather than __cmp__, and since < (subset) for sets is defined, sort blindly goes and sorts the list. However, look at this: py> from sets import Set as set py> L1 = [set([1]), set([2]), set([3])] py> L2 = [set([2]), set([3]), set([1])] Notice that L1 and L2 contain the same elements, just in different orders. Sorting those two lists should give the same order, correct? py> L1.sort() py> L2.sort() py> L1 [Set([1]), Set([2]), Set([3])] py> L2 [Set([2]), Set([3]), Set([1])] Should, but doesn't. Oops. That's a bug. Personally, I argue that sorting is something that you do to lists, and that all lists should be sortable regardless of whether the objects within them have a total order or not. If not, the list should impose a sensible ordering on them, such as (perhaps) lexicographical order ("dictionary order"). Look at it this way: books do not have a total ordering. What does it mean to say that Tolstoy's "War and Peace" is less than Tom Clancy's "The Hunt For Red October"? Are we comparing by title? Author? Contents? Cost of the book, dimensions of the book, number of pages, some subjective idea of quality, average word length, maximum sentence length, publication date, number of protagonists, or what? And yet libraries somehow find no difficulty in sorting shelves of hundreds or thousands of books. So in my opinion, sorting is a red-herring. In an ideal world (Python 3 perhaps?) sorting will be divorced from comparisons, or at least seeing other people. But that still leaves the problem of comparisons. -- Steven. -- http://mail.python.org/mailman/listinfo/python-list