On Feb 1, 10:29 pm, Paul Rubin <http://[EMAIL PROTECTED]> wrote: > Dan Stromberg <[EMAIL PROTECTED]> writes: > > > * Is the feature useful for the broad mass? > > > Yes, probably, at least if this kind of feature's inclusion in other > > languages and my two recent needs for it are any indication. In other > > languages, they are sometimes called bags or multisets. > > I use defaultdict(int) all the time, but it's been fairly rare > to want to do set operations (union, intersection, difference) on > these. I'm not even sure what the intersection should mean.
Let me write the multiset with 3 x's and 2y y' as {2x, 3y} (this is not a suggested Python syntax!). Note: in the following my multisets can have negative multiplicities, whereas the usual definition only accepts positive multiplicities. Let H be the collection of all python hashable objects. Let Z be the set of integers. So a multiset is a mapping from H to Z which maps cofinitely many objects to 0 (a sort of defaultdict(int)) === For mathematically minded people === Well, if one thinks about the mathematical meaning of multisets, then the collection of all multisets is the free Z-module over H. Therefore the meaningful operations for multisets are addition, subtraction and multiplication by a scalar. === For not so mathematically minded people === One way of thinking about a multiset can be as a vector whose dimensions are explicitly named, instead of being implicit. For example in 3D the vector [2, 3, 4] could instead be written as {2x, 3y, 4z}. Just like vectors, multisets can then be added, subtracted and multiplied by as scalar: {2x, 3y} + {4x, 7z} = {6x, 3y, 7z} {4x, 2y, 1z} - {1x, 2y, 3z} = {3x, -2z} {3x, 2y}*5 = {15x, 10y} Now if one wants to extend union and intersection to multisets: * For sets {x, y} union {y, z} = {x, y, z}. The natural way of extending this to multisets is having the union operator take the max of the multiplicities of each element, i.e. {2x, 7y, 1z} union {3y, 1z, 5t} = {2x, 7y, 1z, 5t} {3x, -5y} union {2z} = {3x, 2z} (!) * Similarly, for intersection one would take the min of multiplicities, i.e. {2x, 7y, 1z} intersection {3y, 1z, 5t} = {3y, 1z} {3x, -5y} intersection {2z} = {-5y} (!) * difference and symmetric_difference can be defined in terms of +, -, union, intersection: A difference B = A - (A intersection B) A symmetric_difference B = (A union B) - (A intersection B) Note that union, intersection, difference and symmetric_difference defined this way are fully compatible with the corresponding set operations in the following sense: If A and B are sets, then - multiset(A).union(multiset(B)) == multiset(A.union(B)) - multiset(A).intersection(multiset(B)) == multiset(A.intersection(B)) (same for difference, symmetric_difference) I think this is how I would like my multisets :) -- Arnaud -- http://mail.python.org/mailman/listinfo/python-list