On Fri, 22 Mar 2013 12:22:13 -0700, Michael Fogleman wrote:

> I feel like Python ought to have a built-in to do this. Take a list of
> items and turn them into a dictionary mapping keys to a list of items
> with that key in common.
>
> It's easy enough to do:
>
>     # using defaultdict
>     lookup = collections.defaultdict(list)
>     for item in items:
>         lookup[key(item)].append(item)
>
>     # or, using plain dict
>     lookup = {}
>     for item in items:
>         lookup.setdefault(key(item), []).append(item)

That's pretty much the reason setdefault was invented. So, in a sense,
there is a built-in for this.

> But this is frequent enough of a use case that a built-in function would
> be nice.

I'm not so sure I agree it's a frequent use-case. I don't think I've ever
needed to do it, or if I did, it was so rare and so long ago that I've
forgotten it.

> I could implement it myself, as such:
>
>     def grouped(iterable, key):
>         result = {}
>         for item in iterable:
>             result.setdefault(key(item), []).append(item)
>         return result
>
>     lookup = grouped(items, key)
>
> This is different than `itertools.groupby` in a few important ways.

Why do you care about itertools.groupby? That does something completely
different. It groups items that occur in *contiguous* runs, e.g.

    [1, 2, 3, 2, 2, 2, 3, 3, 4, 5, 5, 2, 2, 5]

will be split into nine groups, with the 2s landing in three separate
groups:

    [1], [2], [3], [2, 2, 2], [3, 3], [4], [5, 5], [2, 2], [5]

This is a feature of groupby. If you want to accumulate items regardless
of where they occur, e.g. for the above:

    [1], [2, 2, 2, 2, 2, 2], [3, 3, 3], [4], [5, 5, 5]

then there's no need to use groupby.

> Some examples:
>
>     >>> items = range(10)
>     >>> grouped(items, lambda x: x % 2)
>     {0: [0, 2, 4, 6, 8], 1: [1, 3, 5, 7, 9]}
>
>     >>> items = 'hello stack overflow how are you'.split()
>     >>> grouped(items, len)
>     {8: ['overflow'], 3: ['how', 'are', 'you'], 5: ['hello', 'stack']}
>
> Is there a better way?

Looks perfectly fine to me. It's a five line helper function, it's
readable and simple and clear. The only improvement I would make would be
to give it a doc string describing what it does and showing some examples:

    def grouped(items, key):
        """Return a dict with items accumulated by key.

        >>> items = range(10)
        >>> grouped(items, lambda x: x % 2)
        {0: [0, 2, 4, 6, 8], 1: [1, 3, 5, 7, 9]}

        >>> items = 'hello stack overflow how are you'.split()
        >>> grouped(items, len)
        {8: ['overflow'], 3: ['how', 'are', 'you'], 5: ['hello', 'stack']}

        """
        result = {}
        for item in items:
            result.setdefault(key(item), []).append(item)
        return result

Now you have a nice, descriptive help string for when you call
help(grouped).


-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list
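
P.S. To make the groupby difference concrete, here is a minimal sketch that
runs both approaches over the example list from above. The names data,
contiguous, accumulated and sorted_runs are only illustrative, and the
sort-then-groupby variant is my own addition rather than something from the
original question. It assumes the items are sortable.

```python
from itertools import groupby


def grouped(items, key):
    """Return a dict with items accumulated by key."""
    result = {}
    for item in items:
        result.setdefault(key(item), []).append(item)
    return result


# Illustrative data: the example list from the discussion above.
data = [1, 2, 3, 2, 2, 2, 3, 3, 4, 5, 5, 2, 2, 5]

# groupby only merges *adjacent* equal items, so the 2s are split up.
contiguous = [list(run) for _, run in groupby(data)]

# grouped() accumulates every item under its key, wherever it occurs.
accumulated = grouped(data, key=lambda x: x)

# Sorting first makes equal items adjacent, so groupby then sees one run
# per distinct value (at the cost of losing the original ordering).
sorted_runs = [list(run) for _, run in groupby(sorted(data))]

print(contiguous)
print(accumulated)
print(sorted_runs)
```

The first print shows nine runs, the other two show one group per distinct
value. If you don't need the contiguous behaviour, the plain dict version
is simpler and does not require sorting.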