Raymond Hettinger wrote: > If set-ifying a dictionary turns out to be an essential and > time-critical operation, it is possible to add some special case code > for this. The question is whether it is worth it. If you find the > need is pressing, file a feature request explaining why it is > essential. Then, I'll implement it for you. On the other hand, if > this is a theoretical nice-to-have, then why clutter up the code > (with pre-sizing and forcing all values to True).
Just today I was writing some code where I wanted to use sets for the abstraction (intersection, etc.), but also carry some values with them to process. So, yes, I believe that having set-like abstraction for dictionaries would be great. In fact, for a moment I wondered if dict.keys() was already of type set in 2.4, because it could have made sense. My concrete situation was a routine that was updating a current snapshot of data (stored in a dictionary) with a new snapshot of data (another dictionary). With 'updating' I mean some kind of algorithm where you have to do different things for: - items that were present in the current snapshot but not in the new snapshot anymore (obsolete items) - items that were not present in the current snapshot but are presente in the new snapshot (new items) - items that were present in both the current and the new snapshot (possibly modified items) So, this was a perfect suit for sets. Given my dictionaries d1 and d2, I would have liked to be able to do: for k,v in (d1 - d2).iteritems(): ... for k,v in (d2 - d1).iteritems(): ... for k,v in (d1 & d2).iteritems(): # uhm, not fully correct ... Of course, there are some nitpicks. For instance, in the last case the dictionary at the intersection should probably hold a tuple of two values, one coming from each dictionary (since the intersection finds the items whose key is present in both dictionaries, but the value can obviously be different). So you'd probably need something: for k,(v1,v2) in (d1 & d2).iteritems(): ... Another solution would be to say that set-like operations, like (d1 & d2), return an iterator to a sequence of keys *only*, in this case the sequence of keys available in both dictionaries. I also tried a different approach, that is putting my data in some structure that was suitable for a set (with __hash__ and __key__ methods that were caring of the key part only): class FakeForSet: def __init__(self, k, v): self.key = key self.value = value def __hash__(self): return hash(self.key) def __cmp__(self, other): return cmp(self.key, other.key) but this looked immediatly weird because I was lying to Python somehow. It then became clear that it's not going to work, because when you go through (s1 & s2) you do not know if the items you get are coming from s1 or s2, and that is important because the value member is going to be different (even if you're lying to Python saying that those items are equal anyway). I know of course there are other ways of doing the same thing, like: # intersection for k in d1: if k not in d2: continue v1, v2 = d1[k], d2[k] ... But I don't think there is anything wrong in providing an abstraction for this, especially since we already decided that set-like abstractions are useful. So, FWIW, I would find set-like operations on dictionaries an useful addition to Python. Besides, I guess they can be easily experimented and studied by subclassing dict. -- Giovanni Bajo -- http://mail.python.org/mailman/listinfo/python-list