ben.tay...@email.com wrote:
Found this while trying to do something unrelated and was curious...
If you hash an integer (e.g. hash(3)) you get the same integer out. If
you hash a string you also get an integer. If you hash None you get an
integer again, but the integer you get varies depending on which
machine you're running Python on (which isn't true for numbers and
strings).
This raises the following questions:
1. Is it correct that if you hash two things that are not equal they
might give you the same hash value? Like, for instance, None and the
number 261862182320 (which is what my machine gives me if I hash
None). Note this is just an example, I'm aware hashing integers is
probably daft. I'm guessing that's fine, since you can't hash
something to a number without colliding with that number (or at least
without hashing the number to something else, like hashing every
number to itself * 2, which would then mean you couldn't hash very
large numbers)
2. Should the hash of None vary per-machine? I can't think why you'd
write code that would rely on the value of the hash of None, but you
might I guess.
3. Given that presumably not all things can be hashed (since the
documentation description of hash() says it gives you the hash of the
object "if it can be hashed"), should None be hashable?
Bit esoteric perhaps, but like I said, I'm curious. ;-)
Ben
1. Most definitely. Every definition of hash (except for a "perfect
hash") makes it a many-to-one mapping, so unequal objects can share a
hash value. A good hash only reduces the likelihood of such collisions.
And Python's spec, which says that integers, longs and floats that are
equal are guaranteed the same hash value, is a new one for me. Thanks
for making me look it up.
2. Nothing guarantees that the Python hash() will return the same value
for the same object between implementations, or even between multiple
runs with the same version on the same machine. In fact, the default
hash for user-defined classes is based on the id() of the object, which
can vary between program runs. Currently, in CPython, id() is
implemented to just return the address of the object.
3. Normally, it's just mutable objects that are unhashable. Since None
is definitely immutable, it should have a hash. Besides, if it weren't
hashable, it couldn't be used as a key in a dictionary.
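
A minimal sketch of the three points above, in case it helps to try
them at the prompt. SameBucket, Plain and is_hashable are made-up names
for illustration only, not anything Python provides, and the snippet
deliberately avoids printing actual hash values for None or for
instances, since those are the ones that may differ between machines
and runs.

```python
class SameBucket:
    """Hypothetical class whose instances all share one hash value."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42  # deliberately many-to-one

    def __eq__(self, other):
        return isinstance(other, SameBucket) and self.name == other.name


class Plain:
    """Hypothetical class that keeps the default, identity-based hash."""


def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


# Point 1: unequal objects may share a hash; equal numbers must share one.
a, b = SameBucket("a"), SameBucket("b")
collide = {a: "first", b: "second"}   # both keys survive the collision
numbers_agree = hash(1) == hash(1.0)  # 1 == 1.0, so the hashes match

# Point 2: the default hash is stable within one run (no claim across runs).
p = Plain()
stable_in_run = hash(p) == hash(p)

# Point 3: None is immutable and hashable, so it works as a dict key.
lookup = {None: "no value"}

print(len(collide), numbers_agree, stable_in_run, lookup[None])
# prints: 2 True True no value
print(is_hashable(None), is_hashable([1, 2]), is_hashable((1, 2)))
# prints: True False True
```

The dictionary still tells a and b apart because it falls back on ==
whenever two keys land on the same hash value; the hash only narrows
down where to look.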
All my opinions, of course.
DaveA