On Thu, 21 Nov 2013 11:42:49 +0000 Oscar Benjamin <oscar.j.benja...@gmail.com> wrote:
> On 21 November 2013 06:46, John O'Hagan <resea...@johnohagan.com> > wrote: > > > > I found a verbal description of such an algorithm and came up with > > this: > > > > def multicombs(it, r): > > result = it[:r] > > yield result > > while 1: > > for i in range(-1, -r - 1, -1): > > rep = result[i] > > if rep < it[i]: > > break > > else: > > break > > for j, n in enumerate(it): > > if n > rep: > > break > > result = result[:i] + it[j:j - i] > > yield result > > I'm not really sure what it is you're asking for. I thought if I ran > the code I'd understand but that just confused me more. Is the output > below correct? If not what should it be? > > multicombs("abracadabra", 0) > [''] > multicombs("abracadabra", 1) > ['a'] > multicombs("abracadabra", 2) > ['ab', 'br', 'ra'] > multicombs("abracadabra", 3) > ['abr', 'ara', 'bra'] > multicombs("abracadabra", 4) > ['abra'] > multicombs("abracadabra", 5) > ['abrac', 'abrbr', 'abrra', 'braca', 'brara', 'brbra', 'racad', > 'racbr', 'racra'] I neglected to mention that multicombs takes a sorted iterable; it doesn't work right otherwise. I'd forgotten that because my wordlists are guaranteed sorted by the way they're built. Sorry about that. In my use-case the first argument to multicombs is a tuple of words which may contain duplicates, and it produces all unique combinations of a certain length of those words, eg: list(multicombs(('cat', 'hat', 'in', 'the', 'the'), 3)) [('cat', 'hat', 'in'), ('cat', 'hat', 'the'), ('cat', 'in', 'the'), ('cat', 'the', 'the'), ('hat', 'in', 'the'), ('hat', 'the', 'the'), ('in', 'the', 'the')] Contrast this with: list(itertools.combinations(('cat', 'hat', 'in', 'the', 'the'), 3)) [('cat', 'hat', 'in'), ('cat', 'hat', 'the'), ('cat', 'hat', 'the'), ('cat', 'in', 'the'), ('cat', 'in', 'the'), ('cat', 'the', 'the'), ('hat', 'in', 'the'), ('hat', 'in', 'the'), ('hat', 'the', 'the'), ('in', 'the', 'the')] which produces results which are redundant for my purposes. What I'm looking for is a recursive algorithm which does what multicombs does (order unimportant) so that I can apply a pruning shortcut like the one I used in the recursive cartesian product algorithm in my original post. Multiset combination algorithms seem pretty thin on the ground out there - as I said, I could only find a description of the procedure above, no actual code. The ones I did find are non-recursive. I'm hoping some combinatorics and/or recursion experts can offer advice. Regards, -- John -- https://mail.python.org/mailman/listinfo/python-list