Christoph Zwerschke wrote: > But of course, it will always be slower since it is constructed on top > of the built-in dict. In end effect, you always have to maintain a > sequence *plus* a dictionary, which will be always slower than a sheer > dictionary. The ordered dictionary class just hides this uglyness of > having to maintain a dictionary plus a sequence, so it's rather an issue > of convenience in writing and reading programs than a performance issue. > > It may be different if the ordered dict would be implemented directly as > an ordered hash table in C.
The problem with hashing is that one is storing data from a possibly wildly varying range in a set of values with a limited range. That's where the ordering problems come from in the first place. If one wants to solve it once and for all one has to give up the speed advantage that makes hashing so popular. I wonder if optimized C code using bisect would be very much slower for small ranges? The current set implementation uses dicts to implement sets, while sets are a more basic data type than dicts. At least dicts and sets should be derived from the same type. Just speaking from an intuitive point of view of course :-) Here's a sorted dict implementation without using hashes, which maybe would be fast if implemented in C. The insertion order can be retrieved using the keys and values lists from the object itself, items() gives a sorted sequence. Anton. NB warning untested code. When using mutables as keys which is possible by this implementation, don't change the keys after they're used in the object. from array import array class VDict: def __init__(self,sequence = []): self.keys = [] self.values = [] self.unranks = array('L',[]) for key,value in sequence: self[key] = value def __setitem__(self,key,value): keys = self.keys values = self.values unranks = self.unranks n = len(keys) i = self.bisect_left(key) if i == n or keys[unranks[i]] != key: keys.append(key) values.append(value) unranks.insert(i,n) else: values[i] = value def __getitem__(self,key): i = self.bisect_left(key) return self.values[self.unranks[i]] def bisect_left(self,key, lo=0, hi=None): keys = self.keys unranks = self.unranks if hi is None: hi = len(keys) while lo < hi: mid = (lo+hi)//2 if keys[unranks[mid]] < key: lo = mid+1 else: hi = mid return lo def __contains__(self,key): keys = self.keys unranks = self.unranks n = len(keys) i = self.bisect_left(key) return i < n and keys[unranks[i]] == key def __len__(self): return len(self.keys) def items(self): keys = self.keys values = self.values unranks = self.unranks return [(keys[i],values[i]) for i in unranks] def __iter__(self): return iter(self.items()) def remove(self,key): keys = self.keys values = self.values unranks = self.unranks n = len(keys) i = self.bisect_left(key) x = unranks[i] if i < n and keys[x] == key: del keys[x] del values[x] del unranks[i] for j,k in enumerate(unranks): if k > x: unranks[j] -= 1 def test(): V = VDict() L = [1,2,3] V[L] = 10 print V[L] V[L] = 20 print V[L] V.remove(L) print V.items() V = VDict(zip('edcba',range(5))) print V.items() print V['d'] V.remove('d') print V.items() if __name__=='__main__': test() -- http://mail.python.org/mailman/listinfo/python-list