On Wed, 19 Sep 2007 20:58:03 +0000, Karthik Gurusamy wrote: > While it's easy to explain the behavior, I think the decision to dis- > allow mutable items as keys is a bit arbitrary. There is no need for > dict to recompute hash
What??? Of course it does. How else can it look up the key? Because it (somehow) just recognizes that it has seen the key before? How? By magic? > (first of all, a user doesn't even need to know > if underneath 'hashing' is used -- the service is just a mapping between > one item to another item). The user doesn't need to know the mechanism, but the dict does. Dicts are implemented as hash tables. I suppose they could be implemented as something else (what? linear lists? some sort of tree?) but the same considerations must be made: the dict must be able to find keys it has seen before. How is the dict supposed to recognise the key if the key has changed? > Since we know hashing is used, all that is needed is, a well-defined way > to construct a hash out of a mutable. "Given a sequence, how to get a > hash" is the problem. Nonsense. That's not the problem. The problem is how to get exactly the same hash when the sequence has changed. In other words, if you have two mutable objects M1 and M2, then you expect: hash(M1) == hash(M2) if and only if M1 and M2 are equal hash(M1) != hash(M2) if M1 and M2 are unequal but also: if M1 mutates to become equal to M2, hash(M1) must remain the same while still being different from hash(M2). That means that hash() now is a non-deterministic function. hash([1,2,3]) will vary according to how the list [1,2,3] was constructed! Obviously something has to give, because not all of these things are mutually compatible. > If later the given sequence is different, that's > not the dict's problem. Data structures don't have problems. Programmers do. And language designers with sense build languages that minimize the programmers problems, not maximize them. > So if the list changes, it will result in a different hash and we will > get a hash-miss. I doubt this is in anyway less intuitive than dis- > allowing mutable items as keys. The choices for the language designer are: (1) Invent some sort of magical non-deterministic hash function which always does the Right Thing. (2) Allow the hash of mutable objects to change, which means you can use mutable objects as keys in dicts but if you change them, you can no longer find them in the dict. They'll still be there, using up memory, but you can't get to them. (3) Simply disallow mutable objects as keys. Alternative 1 is impossible, as we've seen, because the requirements for the Right Thing are not mutually compatible. Alternative (2) leads to hard-to-find, hard-to-diagnose bugs where you store objects in a dict only for them to mysteriously disappear later. Worse, it could lead to bugs like the following hypothetical: >>> M = [1, 2, 3] >>> D = {M: 'parrot'} # pretend this works >>> D {[1, 2, 3]: 'parrot'} >>> M.append(4) >>> D {[1, 2, 3, 4]: 'parrot'} >>> D[[1, 2, 3, 4]] Traceback (most recent call last): File "<stdin>", line 1, in <module> KeyError: [1, 2, 3, 4] Try explaining that one to programmers: they can SEE the key in the dict when they print it, but they can't get it or delete it because the hash has changed. Alternative 3 is easy to deal with: simply don't use mutable objects as keys. That's what Python does. Sure, the programmer sometimes needs to work around the lack (convert the list into a tuple, or a string, or pickle it, whatever...) which on rare occasions is hard to do, but at least there are no mysterious, hard to track down bugs. -- Steven. -- http://mail.python.org/mailman/listinfo/python-list