On Mar 7, 5:46 pm, Steven D'Aprano <st...@remove-this-cybersource.com.au> wrote:
> Given that Counter supports negative counts, it looks to me that the
> behaviour of __add__ and __sub__ is fundamentally flawed. You should
> raise a bug report (feature enhancement) on the bug tracker.

It isn't a bug. I designed it that way. There were several possible design choices, each benefitting different use cases.

FWIW, here is the reasoning behind the design.

The basic approach to Counter() is to be a dict subclass that supplies zero for missing values. This approach places almost no restrictions on what can be stored in it. You can store floats, decimals, fractions, etc. Numbers can be positive, negative, or zero. This design leaves users with a good deal of flexibility, but more importantly it keeps the API very close to regular dictionaries (thus lowering the learning curve). It also makes basic counter operations very fast.

The update() method differs from regular dictionaries because dict.update() isn't very helpful in a counter context. Instead of replacing values, it adds counts, which allows one counter to update another. Like the other basic counter operations, it places no restrictions on type (i.e. it works with ints, floats, decimals, fractions, etc.) or on sign (positive, negative, or zero). The only requirement is that the values support addition.

The basic API also adds some convenience methods. The most_common() method tries to not be restrictive. It only requires that the count values be orderable.

For obvious reasons, the elements() method *does* have restrictions: it won't work with float values and won't emit entries with non-positive counts.

Beyond the basic API listed above, which is straightforward, the area with the more interesting design choices was the Counter-to-Counter operations (__add__, __sub__, __or__, and __and__).

One possible choice (the one preferred by the OP) was to have addition and subtraction be straight adds and subtracts without respect to sign, and to not support __and__ and __or__. Straight addition was already supported via the update() method. But no direct support was provided for straight subtractions that leave negative values. Sorry about that.

Instead the choice was to implement the four methods as multiset operations.
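To make the distinction concrete, here is a minimal sketch of the basic API next to the four operators. It is only an illustration, assuming a current Python 3 and nothing beyond the standard library; the variable names are made up for the example.

```python
from collections import Counter
from fractions import Fraction

# Basic API: missing keys read as zero (and are not inserted by the lookup).
c = Counter(a=3, b=1)
missing = c['z']

# update() adds counts in place, with no restriction on sign or type.
c.update({'a': -5, 'b': Fraction(1, 2)})   # a becomes -2, b becomes 3/2

# elements() skips entries whose count is zero or negative.
d = Counter(a=2, b=0, c=-1)
elems = sorted(d.elements())

# The four operators behave as multiset operations:
# only strictly positive counts survive in the result.
x = Counter(a=1, b=1, c=1)
y = Counter(c=1, d=1, e=1)
diff = x - y     # saturating subtraction
total = x + y    # sum of counts
union = x | y    # maximum of counts
inter = x & y    # minimum of counts

# Straight addition via update() keeps a negative count,
# while the + operator drops it.
straight = Counter(a=1)
straight.update(Counter(a=-3))             # a becomes -2
multiset_sum = Counter(a=1) + Counter(a=-3)  # empty Counter
```

The last two statements show the trade-off under discussion: update() is the "straight add", while + follows multiset rules.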
As such, they need to correspond to regular set operations. For example, with sets:

    set('abc') - set('cde')    # gives set('ab')

Notice how subtracting 'e' did not produce a negative result? Now with multisets, we want the same result:

    Counter({'a':1, 'b':1, 'c':1}) - Counter({'c':1, 'd':1, 'e':1})    # gives Counter({'a': 1, 'b': 1})

So the design decision was to support multiset operations that produce only results with positive counts.

These multiset-style mathematical operations are discussed in Knuth's TAOCP Volume II section 4.6.3 exercise 19 and at http://en.wikipedia.org/wiki/Multiset . Also, C++ has multisets that support only positive counts.

So, there you have it. The design tries to be as unrestrictive as possible. When a choice had to be made, it favored multisets over behaviors supporting negative values.

Fortunately, it is trivially easy to extend the class to add any behavior you want.

    from collections import Counter

    class MyCounter(Counter):
        def __sub__(self, other):
            # Straight subtraction: negative and zero counts are kept.
            result = self.copy()
            for elem, cnt in other.items():
                result[elem] -= cnt
            return result

Hope this gives you some insight into the design choices.

Raymond Hettinger
--
http://mail.python.org/mailman/listinfo/python-list