Re: [Python-Dev] Why should the default hash(x) == id(x)?

2005-11-06 Thread Martin v. Löwis
Noam Raphael wrote:
> The alternative is to drop the __hash__ method of user-defined classes
> (as Guido already decided to do), and to make the default __eq__
> method compare the two objects' __dict__ and slot members.

The question then is what hash(x) would do. It seems that you expect
it then somehow not to return a value. However, under this patch,
the fallback implementation (use pointer as the hash) would be used,
which would preserve hash(x)==id(x).

> See the thread about default equality operator - Josiah Carlson posted
> there a metaclass implementing this equality operator.

This will likely cause a lot of breakage. Objects will compare equal
even though they conceptually are not, and even though they did not
compare equal in previous Python versions.

Regards,
Martin
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] For Python 3k, drop default/implicit hash, and comparison

2005-11-06 Thread Jim Fulton

The recent discussion about what the default hash and equality comparisons
should do makes me want to chime in.

IMO, the provision of defaults for hash, eq and other comparisons
was a mistake.  I'm especially sensitive to this because I do a lot
of work with persistent data that outlives program execution. For such
objects, memory address is meaningless.  In particular, the default
ordering of objects based on address has caused a great deal of pain
to people who store data in persistent BTrees.

Oddly, what I've read in these threads seems to be arguing about
which implicit method is best.  The answer, IMO, is to not do this
implicitly at all.  If programmers want their objects to be
hashable, comparable, or orderable, then they should implement operators
explicitly.  There could even be a handy, but *optional*, base class that
provides these operators based on ids.

This would be too big a change for Python 2 but, IMO, should definitely
be made for Python 3k.  I doubt any change in the default definition
of these operations is practical for Python 2.  Too many people rely on
them, usually without really realizing it.

Let's plan to stop guessing how to do hash and comparison.

Explicit is better than implicit. :)

Jim

-- 
Jim Fulton           mailto:[EMAIL PROTECTED]   Python Powered!
CTO                  (540) 361-1714             http://www.python.org
Zope Corporation     http://www.zope.com        http://www.zope.org


Re: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison

2005-11-06 Thread Guido van Rossum
On 11/6/05, Jim Fulton <[EMAIL PROTECTED]> wrote:
> IMO, the provision of defaults for hash, eq and other comparisons
> was a mistake.

I agree with you 66%. Default hash and inequalities were a
mistake. But I wouldn't want to do without a default ==/!=
implementation (and of course it should be defined so that an object
is only equal to itself).

In fact, the original hash() was clever enough to complain when __eq__
(or __cmp__) was overridden but __hash__ wasn't; but this got lost by
accident for new-style classes when I added a default __hash__ to the
new universal base class (object). But I think the original default
hash() isn't particularly useful, so I think it's better to just not
be hashable unless __hash__ is defined explicitly.
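
A minimal sketch of the "complain when __eq__ is overridden but __hash__
isn't" behavior, assuming Python 3 semantics rather than anything available
in 2005. The class names Plain and Point and the helper is_hashable are
hypothetical, chosen only for illustration:

```python
class Plain:
    """No __eq__ or __hash__: instances keep the identity-based defaults."""


class Point:
    """Defines __eq__ only, so Python 3 sets __hash__ to None."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


def is_hashable(obj):
    """Report whether hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


print(is_hashable(Plain()))      # identity-based default hash still applies
print(is_hashable(Point(1, 2)))  # __eq__ without __hash__: unhashable
```

Note that under these semantics a class that overrides neither method stays
hashable; only overriding __eq__ alone removes the hash.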

> I'm especially sensitive to this because I do a lot
> of work with persistent data that outlives program execution. For such
> objects, memory address is meaningless.  In particular, the default
> ordering of objects based on address has caused a great deal of pain
> to people who store data in persistent BTrees.

This argues against the inequalities (<, <=, >, >=) and I agree.
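
A short sketch of what "no default inequalities, but a default ==" looks
like in practice, assuming Python 3 semantics. The class name Record is
hypothetical:

```python
class Record:
    """Hypothetical class that defines no comparison methods at all."""


a, b = Record(), Record()

# Equality falls back to identity, so an object equals only itself.
print(a == a, a == b)

# Ordering has no default in Python 3: "<" raises TypeError instead of
# silently comparing memory addresses.
try:
    a < b
except TypeError as exc:
    print("unorderable:", exc)
```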

> Oddly, what I've read in these threads seems to be arguing about
> which implicit method is best.  The answer, IMO, is to not do this
> implicitly at all.  If programmers want their objects to be
> hashable, comparable, or orderable, then they should implement operators
> explicitly.  There could even be a handy, but *optional*, base class that
> provides these operators based on ids.

I don't like that final suggestion. Before you know it, a meme
develops telling newbies that all classes should inherit from that
"optional" base class, and then later it's impossible to remove it
because you can't tell whether it's actually needed or not.

> This would be too big a change for Python 2 but, IMO, should definitely
> be made for Python 3k.  I doubt any change in the default definition
> of these operations is practical for Python 2.  Too many people rely on
> them, usually without really realizing it.

Agreed.

> Let's plan to stop guessing how to do hash and comparison.
>
> Explicit is better than implicit. :)

Except that I really don't think that there's anything wrong with a
default __eq__ that uses object identity. As Martin pointed out, it's
just too weird that an object wouldn't be considered equal to itself.
It's the default __hash__ and __cmp__ that mess things up.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison

2005-11-06 Thread Jim Fulton
Guido van Rossum wrote:
> On 11/6/05, Jim Fulton <[EMAIL PROTECTED]> wrote:
> 
...
> Except that I really don't think that there's anything wrong with a
> default __eq__ that uses object identity. As Martin pointed out, it's
> just too weird that an object wouldn't be considered equal to itself.
> It's the default __hash__ and __cmp__ that mess things up.

Good point.  I agree.

Jim

-- 
Jim Fulton           mailto:[EMAIL PROTECTED]   Python Powered!
CTO                  (540) 361-1714             http://www.python.org
Zope Corporation     http://www.zope.com        http://www.zope.org


Re: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison

2005-11-06 Thread John Williams
(This is kind of on a tangent to the original discussion, but I don't 
want to create yet another subject line about object comparisons.)

Lately I've found that virtually all my implementations of __cmp__, 
__hash__, etc. can be factored into this form inspired by the "key" 
parameter to the built-in sorting functions:

class MyClass:

    def __key(self):
        # Return a tuple of attributes to compare.
        return (self.foo, self.bar, ...)

    def __cmp__(self, that):
        return cmp(self.__key(), that.__key())

    def __hash__(self):
        return hash(self.__key())

I wonder if it wouldn't make sense to formalize this pattern with a 
magic __key__ method such that a class with a __key__ method would 
behave as if it had inherited the definitions of __cmp__ and __hash__ above.

This scheme would eliminate the tedium of keeping the __hash__ method in 
sync with the __cmp__/__eq__ method, and writing a __key__ method would 
involve writing less code than a naive __eq__ method, since each 
attribute name only needs to be mentioned once instead of appearing on 
either side of a "==" expression.

On the other hand, this idea doesn't work in all situations (for 
instance, I don't think you could define the default __cmp__/__hash__ 
semantics in terms of __key__), it would only eliminate two one-line 
methods for each class, and it would further complicate the "==" 
operator (__key__, falling back to __eq__, falling back to __cmp__, 
falling back to object identity--ouch!)

If anyone thinks this is a good idea I'll investigate how many places in 
the standard library this pattern would apply.
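
For comparison, a sketch of the same key pattern under Python 3, where
cmp() and __cmp__ no longer exist, so the key feeds __eq__, __lt__ and
__hash__ instead. This is one possible spelling, not a proposed protocol;
the class name Version and the method name _key are hypothetical:

```python
from functools import total_ordering


@total_ordering
class Version:
    """Hypothetical class whose comparisons and hash share one key."""

    def __init__(self, major, minor):
        self.major, self.minor = major, minor

    def _key(self):
        # Each compared attribute is named exactly once.
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()

    def __hash__(self):
        return hash(self._key())


print(sorted([Version(2, 0), Version(1, 5)])[0]._key())  # (1, 5)
print(len({Version(1, 5), Version(1, 5)}))               # 1
```

functools.total_ordering fills in the remaining ordering methods from
__eq__ and __lt__, so the hash and every comparison stay in sync with the
single _key() definition.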

--jw


Re: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison

2005-11-06 Thread Guido van Rossum
On 11/6/05, John Williams <[EMAIL PROTECTED]> wrote:
> (This is kind of on a tangent to the original discussion, but I don't
> want to create yet another subject line about object comparisons.)
>
> Lately I've found that virtually all my implementations of __cmp__,
> __hash__, etc. can be factored into this form inspired by the "key"
> parameter to the built-in sorting functions:
>
> class MyClass:
>
>     def __key(self):
>         # Return a tuple of attributes to compare.
>         return (self.foo, self.bar, ...)
>
>     def __cmp__(self, that):
>         return cmp(self.__key(), that.__key())
>
>     def __hash__(self):
>         return hash(self.__key())

The main way this breaks down is when comparing objects of different
types. While most comparisons typically are defined in terms of
comparisons on simpler or contained objects, two objects of different
types that happen to have the same "key" shouldn't necessarily be
considered equal.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison

2005-11-06 Thread Phillip J. Eby
At 12:58 PM 11/6/2005 -0800, Guido van Rossum wrote:
>The main way this breaks down is when comparing objects of different
>types. While most comparisons typically are defined in terms of
>comparisons on simpler or contained objects, two objects of different
>types that happen to have the same "key" shouldn't necessarily be
>considered equal.

When I use this pattern, I often just include the object's type in the 
key.  (I call it the 'hashcmp' value, but otherwise it's the same pattern.)
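
A rough sketch of what including the type in the key could look like,
written for Python 3. Only the name 'hashcmp' comes from the message above;
the mixin Keyed, the helper _fields, and the two example classes are
hypothetical:

```python
class Keyed:
    """Hypothetical mixin: equality and hash come from _hashcmp()."""

    def _fields(self):
        raise NotImplementedError

    def _hashcmp(self):
        # Putting the exact type first keeps equal field tuples from
        # different classes apart.
        return (type(self),) + tuple(self._fields())

    def __eq__(self, other):
        if not isinstance(other, Keyed):
            return NotImplemented
        return self._hashcmp() == other._hashcmp()

    def __hash__(self):
        return hash(self._hashcmp())


class Celsius(Keyed):
    def __init__(self, degrees):
        self.degrees = degrees

    def _fields(self):
        return (self.degrees,)


class Fahrenheit(Keyed):
    def __init__(self, degrees):
        self.degrees = degrees

    def _fields(self):
        return (self.degrees,)


print(Celsius(20) == Celsius(20))     # True
print(Celsius(20) == Fahrenheit(20))  # False
```

Because the key uses the exact type, an instance of a subclass never
compares equal to an instance of its parent class in this sketch.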



Re: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison

2005-11-06 Thread Guido van Rossum
On 11/6/05, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> At 12:58 PM 11/6/2005 -0800, Guido van Rossum wrote:
> >The main way this breaks down is when comparing objects of different
> >types. While most comparisons typically are defined in terms of
> >comparisons on simpler or contained objects, two objects of different
> >types that happen to have the same "key" shouldn't necessarily be
> >considered equal.
>
> When I use this pattern, I often just include the object's type in the
> key.  (I call it the 'hashcmp' value, but otherwise it's the same pattern.)

But how do you make that work with subclassing? (I'm guessing your
answer is that you don't. :-)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison

2005-11-06 Thread Josh Hoyt
On 11/6/05, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> On 11/6/05, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> > When I use this pattern, I often just include the object's type in the
> > key.  (I call it the 'hashcmp' value, but otherwise it's the same pattern.)
>
> But how do you make that work with subclassing? (I'm guessing your
> answer is that you don't. :-)

If there is a well-defined desired behaviour for comparisons in the
face of subclassing (and I'm not sure there is), then that
behaviour could become part of the definition of how __key__ works.
Since __key__ would be for clarity of intent and convenience of
implementation, adding default behaviour for the most common case
seems like it would be a good idea.

My initial thought was that all subclasses of the class where __key__
was defined would compare as equal if they return the same value. More
precisely, if two objects have the same __key__ method, and it returns
the same value, then they are equal. That does not solve the __cmp__
problem, unless the __key__ function is used as part of the ordering.

For example:

def getKey(obj):
    # Pair the identity of the class's __key__ method with the key value.
    key = getattr(obj.__class__, '__key__')
    return (id(key), key(obj))

An obvious drawback is that if __key__ is overridden, then the
subclass where it is overridden and all further subclasses will no
longer have equality to the superclass. I think that this is probably
OK, except that it may be occasionally surprising.
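
A sketch of how that rule plays out, assuming Python 3, where looking a
method up on the class returns the underlying function object. The
__key__ name is the hypothetical protocol under discussion, and get_key,
Base, Inherits and Overrides are illustrative names only:

```python
def get_key(obj):
    # Look the method up on the class so that subclasses inheriting the
    # same __key__ function share one identity for it.
    key = type(obj).__key__
    return (id(key), key(obj))


class Base:
    def __init__(self, value):
        self.value = value

    def __key__(self):
        return (self.value,)


class Inherits(Base):
    """Reuses Base.__key__ unchanged."""


class Overrides(Base):
    def __key__(self):
        return (self.value,)


print(get_key(Base(1)) == get_key(Inherits(1)))   # True
print(get_key(Base(1)) == get_key(Overrides(1)))  # False
```

The second comparison shows the drawback described above: overriding
__key__, even with an identical body, breaks equality with the superclass.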

Josh


Re: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison

2005-11-06 Thread Josiah Carlson

John Williams <[EMAIL PROTECTED]> wrote:
> 
> (This is kind of on a tangent to the original discussion, but I don't 
> want to create yet another subject line about object comparisons.)
> 
> Lately I've found that virtually all my implementations of __cmp__, 
> __hash__, etc. can be factored into this form inspired by the "key" 
> parameter to the built-in sorting functions:
> 
> class MyClass:
>
>     def __key(self):
>         # Return a tuple of attributes to compare.
>         return (self.foo, self.bar, ...)
>
>     def __cmp__(self, that):
>         return cmp(self.__key(), that.__key())
>
>     def __hash__(self):
>         return hash(self.__key())
> 
> I wonder if it wouldn't make sense to formalize this pattern with a 
> magic __key__ method such that a class with a __key__ method would 
> behave as if it had inherited the definitions of __cmp__ and __hash__ above.

You probably already realize this, but I thought I would point out the
obvious.  Given a suitably modified MyClass...

>>> x = {}
>>> a = MyClass()
>>> a.a = 8
>>> x[a] = a
>>> a.a = 9
>>> x[a] = a
>>>
>>> x
{<__main__.MyClass instance at 0x007E0A08>: <__main__.MyClass instance at 0x007E0A08>,
 <__main__.MyClass instance at 0x007E0A08>: <__main__.MyClass instance at 0x007E0A08>}

Of course everyone is saying "Josiah, people shouldn't be doing that";
but they will.  Given a mechanism to offer hash-by-value, a large number
of users will think that it will work for what they want, regardless of
the fact that in order for it to really work, those attributes must be
read-only by semantics or access mechanisms.  Not everyone who uses
Python understands fully the concepts of mutability and immutability,
and very few will realize that the attributes returned by __key() need
to be immutable aspects of the instance of that class (you can perform
at most one assignment to the attribute during its lifetime, and that
assignment must occur before any hash calls).
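
A small Python 3 sketch of the same hazard, assuming a hash-by-value class
along the lines of the pattern quoted above. This MyClass definition, its
_key method and the variable names are hypothetical stand-ins for the
"suitably modified MyClass":

```python
class MyClass:
    """Hypothetical hash-by-value class with a mutable key attribute."""

    def __init__(self, a):
        self.a = a

    def _key(self):
        return (self.a,)

    def __eq__(self, other):
        if not isinstance(other, MyClass):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        return hash(self._key())


a = MyClass(8)
x = {a: "stored while a.a == 8"}
hash_when_stored = hash(a)

a.a = 9  # mutate an attribute that feeds the hash

# The dict filed the entry under the old hash, but lookups now probe with
# the new one, so the entry may no longer be found through `a`.
print(hash(a) == hash_when_stored)
print(a in x)
```

Whether the second lookup happens to succeed depends on the dict's internal
layout, which is exactly why the key attributes need to stay fixed once the
object has been hashed.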


Call me a pessimist, but I don't believe that using magical key methods
will be helpful for understanding or using Python.

 - Josiah



Re: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison

2005-11-06 Thread Phillip J. Eby
At 01:29 PM 11/6/2005 -0800, Guido van Rossum wrote:
>On 11/6/05, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> > At 12:58 PM 11/6/2005 -0800, Guido van Rossum wrote:
> > >The main way this breaks down is when comparing objects of different
> > >types. While most comparisons typically are defined in terms of
> > >comparisons on simpler or contained objects, two objects of different
> > >types that happen to have the same "key" shouldn't necessarily be
> > >considered equal.
> >
> > When I use this pattern, I often just include the object's type in the
> > key.  (I call it the 'hashcmp' value, but otherwise it's the same pattern.)
>
>But how do you make that work with subclassing? (I'm guessing your
>answer is that you don't. :-)

By either changing the subclass __init__ to initialize it with a different 
hashcmp value, or by redefining the method that computes it.



Re: [Python-Dev] For Python 3k, drop default/implicit hash, and comparison

2005-11-06 Thread Phillip J. Eby
At 07:12 PM 11/6/2005 -0500, Phillip J. Eby wrote:
>At 01:29 PM 11/6/2005 -0800, Guido van Rossum wrote:
> >On 11/6/05, Phillip J. Eby <[EMAIL PROTECTED]> wrote:
> > > At 12:58 PM 11/6/2005 -0800, Guido van Rossum wrote:
> > > >The main way this breaks down is when comparing objects of different
> > > >types. While most comparisons typically are defined in terms of
> > > >comparisons on simpler or contained objects, two objects of different
> > > >types that happen to have the same "key" shouldn't necessarily be
> > > >considered equal.
> > >
> > > > When I use this pattern, I often just include the object's type in the
> > > > key.  (I call it the 'hashcmp' value, but otherwise it's the same pattern.)
> >
> >But how do you make that work with subclassing? (I'm guessing your
> >answer is that you don't. :-)
>
>By either changing the subclass __init__ to initialize it with a different
>hashcmp value, or by redefining the method that computes it.

Scratch that.  I realized 2 seconds after hitting "Send" that you meant the 
case where you want to compare instances with a common parent type.  And 
the answer is, I can't recall having needed to.  (Which is probably why it 
took me so long to realize what you meant.)



[Python-Dev] PEP submission broken?

2005-11-06 Thread Bryan Olson

Though I tried to submit a (pre-) PEP in the proper form through the proper
channels, it has disappeared into the ether.


In building a class that supports Python's slicing interface,

   http://groups.google.com/group/comp.lang.python/msg/8f35464483aa7d7b

I encountered a Python bug, which, upon further discussion, seemed to be
a combination of a wart and a documentation error.

   http://groups.google.com/group/comp.lang.python/browse_frm/thread/402d770b6f503c27

I submitted the bug report via SourceForge; the resolution was to document
the actual behavior.  Next I worked out what behavior I think would eliminate
the wart, wrote it up as a pre-PEP, and sent it to [EMAIL PROTECTED] on 27 Aug
of this year.

I promptly received an automated response from Barry Warsaw, saying, in part,
"I get so much email that I can't promise a personal response."  I gathered
that he is a PEP editor.  I did not infer from his reply that PEPs are simply
ignored, but this automated reply was the only response I ever received.  I
subscribed to the Python-dev list, and watched, and waited; nothing on my
concern appeared.


One response on the comp.lang.python newsgroup noted that a popular
extension module would have difficulty maintaining consistency with my
proposed PEP.  My proposal does not break how the extension currently
works, but still, that's a valid point.  There are variations which do not
have that problem, and I think I can see a course that will serve the entire
Python community.  From what I can tell, we need to address fixing the
PEP process before there is any point in working on PEPs.



-- 
--Bryan