This has been bothering me for a while. Just want to find out if it just me or perhaps others have thought of this too: Why shouldn't the keyset of a dictionary be represented as a set instead of a list? I know that sets were introduced a lot later and lists/dictionaries were used instead but I think "the only correct way" now is for the dictionary keys and values to be sets. Presently {1:0,2:0,3:0}.keys() will produce [1,2,3] but it could also produce [3,1,2] or [3,2,1] on a different machine architecture. The Python documentation states that: """ Keys and values are listed in an _arbitrary_(my emphasis) order which is non-random, varies across Python implementations, and depends on the dictionary's history of insertions and deletions. """
So on the same machine it will be the case that: {1:0, 2:0, 3:0}.keys() == {3:0, 2:0, 1:0}.keys() is True. But if there are 2 lists of keys of the same dictionary, one is pickled on another machine, with a different "arbitrary non-random" ordering, then the keys will not be equal to each other. It seems like the problem could be solved by returning a set instead. The same thing goes for the values(). Here most people will argue that values are not necessarily unique, so a list is more appropriate, but in fact the values are unique it is just that more than one key could map to the same value. What is 'seen' in dictionary is the mapping rule. Also the values are not ordered and should not be indexable -- they should be a set. Just as the keys(), one ordered list of values from a dictionary on one machine will not necessarily be equal to another list of values an on another machine, while in fact, they should be. On a more fundamental and general level, a dictionary is actually an explicit function, also called a 'map'. A set of keys, traditionally called a 'domain' are mapped to a set of values, traditionally called a 'range'. This mapping produces at most a surjective function (i.e. two or more keys can map to same value and all the values are mapped to by some keys). Notice that the traditional counterparts to keys and values are sets and not just lists. This seems like theory babble, but in the case of Python staying 'true' to the theory is usually a GoodThing(tm). I love Python primarily because it does something in only one, correct, and reasonable way. The same principle should probably apply to Python itself including to its built-in data structures. Of course the compatibilty will be broken. Any code relying on a certain ordering of keys() or values() returned would need to be updated. One could argue though that such code was not entirely correct in the first place to asssume that, and should be fixed anyway. Obviously this fix should not be in Python 2.X but perhaps would be worth considering for Python 3000. What does everyone think? -- http://mail.python.org/mailman/listinfo/python-list