On 21 November 2013 13:01, John O'Hagan <resea...@johnohagan.com> wrote: > In my use-case the first argument to multicombs is a tuple of words > which may contain duplicates, and it produces all unique combinations > of a certain length of those words, eg: > > list(multicombs(('cat', 'hat', 'in', 'the', 'the'), 3)) > > [('cat', 'hat', 'in'), ('cat', 'hat', 'the'), ('cat', 'in', 'the'), > ('cat', 'the', 'the'), ('hat', 'in', 'the'), ('hat', 'the', 'the'), > ('in', 'the', 'the')]
I still don't understand what you're actually doing well enough to know whether there is a better general approach to the problem. For the specific thing you requested, here is a recursive multiset combinations generator. Does it do what you wanted? #!/usr/bin/env python def multicombs(it, r): words = [] last = None for N, word in enumerate(it): if word == last: words[-1][1] += 1 else: words.append([word, 1]) last = word cumulative = 0 for n in range(len(words)-1, -1, -1): words[n].append(cumulative) cumulative += words[n][1] return _multicombs((), words, r) def _multicombs(prepend, words, r): if r == 0: yield prepend return (word, count, rem), *remaining = words for k in range(max(r-rem, 0), min(count, r) + 1): yield from _multicombs(prepend + (word,) * k, remaining, r-k) expected = [ ('cat', 'hat', 'in'), ('cat', 'hat', 'the'), ('cat', 'in', 'the'), ('cat', 'the', 'the'), ('hat', 'in', 'the'), ('hat', 'the', 'the'), ('in', 'the', 'the'), ] output = list(multicombs(('cat', 'hat', 'in', 'the', 'the'), 3)) assert len(expected) == len(output) assert set(expected) == set(output) # The order is different Oscar -- https://mail.python.org/mailman/listinfo/python-list