Peter Otten <__pete...@web.de>:

> Steven D'Aprano wrote:
>> On Wed, 09 Aug 2017 20:07:48 +0300, Marko Rauhamaa wrote:
>>
>>> Good point! A very good __hash__() implementation is:
>>>
>>>     def __hash__(self):
>>>         return id(self)
>>>
>>> In fact, I didn't know Python (kinda) did this by default already. I
>>> can't find that information in the definition of object.__hash__():
>>
>>
>> Hmmm... using id() as the hash would be a terrible hash function.

id() is actually an ideal return value of __hash__(). The only criterion
is that the returned number should be different if __eq__() returns
False. That is definitely true for id().

> It's actually id(self) >> 4 (almost, see C code below), to account for
> memory alignment.

Memory alignment makes no practical difference. If it is any good, the
internal implementation will further scramble and scale the returned
hash value. For example:

    index = hash(obj) % prime_table_size

>> would fall into similar buckets if they were created at similar
>> times, regardless of their value, rather than being well distributed.
>
> If that were the problem it wouldn't be solved by the current approach:

It is not a problem. Hash values don't need to be well distributed; they
simply need to be sensitive to tiny differences between unequal objects.

> >>> sample = [object() for _ in range(10)]
> >>> [hash(b) - hash(a) for a, b in zip(sample, sample[1:])]
> [1, 1, 1, 1, 1, 1, 1, 1, 1]

Nice demo :-)


Marko
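
For anyone who wants to try this at home, here is a minimal sketch of the
id()-based __hash__() discussed in this thread. The class name Node, the
helper bucket_spread() and the table size 1009 are made up for
illustration. The modulo step only mirrors the
"index = hash(obj) % prime_table_size" example above; it is a toy model
and not how CPython's dict actually picks slots.

```python
class Node:
    """Hypothetical example class: equality and hashing are both identity-based."""

    def __init__(self, label):
        self.label = label

    def __eq__(self, other):
        # Two nodes are equal only if they are the very same object.
        return self is other

    def __hash__(self):
        # The suggestion from the thread. It is safe here because __eq__
        # is identity-based, so equal objects trivially share a hash.
        return id(self)


def bucket_spread(objs, table_size):
    """Count distinct slots hit by hash(obj) % table_size (toy model only)."""
    return len({hash(obj) % table_size for obj in objs})


nodes = [Node(i) for i in range(1000)]
lookup = {node: node.label for node in nodes}

# Every node is found again under its own key.
assert all(lookup[node] == node.label for node in nodes)

# Number of the 1009 toy slots occupied by the 1000 hashes.
# The value depends on where the objects happen to live in memory.
print(bucket_spread(nodes, 1009))
```

Note that this relies on __eq__() being identity-based as well. Python
requires that objects which compare equal have the same hash, so a class
that compares by value cannot use id() for its hash.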