On Thursday, November 21, 2013 5:01:15 AM UTC-8, John O'Hagan wrote: > On Thu, 21 Nov 2013 11:42:49 +0000 > > Oscar Benjamin wrote: > > > > > On 21 November 2013 06:46, John O'Hagan > > > wrote: > > > > > > > > I found a verbal description of such an algorithm and came up with > > > > this: > > > > > > > > def multicombs(it, r): > > > > result = it[:r] > > > > yield result > > > > while 1: > > > > for i in range(-1, -r - 1, -1): > > > > rep = result[i] > > > > if rep < it[i]: > > > > break > > > > else: > > > > break > > > > for j, n in enumerate(it): > > > > if n > rep: > > > > break > > > > result = result[:i] + it[j:j - i] > > > > yield result > > > > > > I'm not really sure what it is you're asking for. I thought if I ran > > > the code I'd understand but that just confused me more. Is the output > > > below correct? If not what should it be? > > > > > > multicombs("abracadabra", 0) > > > [''] > > > multicombs("abracadabra", 1) > > > ['a'] > > > multicombs("abracadabra", 2) > > > ['ab', 'br', 'ra'] > > > multicombs("abracadabra", 3) > > > ['abr', 'ara', 'bra'] > > > multicombs("abracadabra", 4) > > > ['abra'] > > > multicombs("abracadabra", 5) > > > ['abrac', 'abrbr', 'abrra', 'braca', 'brara', 'brbra', 'racad', > > > 'racbr', 'racra'] > > > > > > I neglected to mention that multicombs takes a sorted iterable; > > it doesn't work right otherwise. I'd forgotten that because my > > wordlists are guaranteed sorted by the way they're built. Sorry about > > that. > > > > In my use-case the first argument to multicombs is a tuple of words > > which may contain duplicates, and it produces all unique combinations > > of a certain length of those words, eg: > > > > list(multicombs(('cat', 'hat', 'in', 'the', 'the'), 3)) > > > > [('cat', 'hat', 'in'), ('cat', 'hat', 'the'), ('cat', 'in', 'the'), > > ('cat', 'the', 'the'), ('hat', 'in', 'the'), ('hat', 'the', 'the'), > > ('in', 'the', 'the')] > > > > Contrast this with: > > > > list(itertools.combinations(('cat', 'hat', 'in', 'the', 'the'), 3)) > > > > [('cat', 'hat', 'in'), ('cat', 'hat', 'the'), ('cat', 'hat', 'the'), > > ('cat', 'in', 'the'), ('cat', 'in', 'the'), ('cat', 'the', 'the'), > > ('hat', 'in', 'the'), ('hat', 'in', 'the'), ('hat', 'the', 'the'), > > ('in', 'the', 'the')] > > > > which produces results which are redundant for my purposes. > > > > What I'm looking for is a recursive algorithm which does what > > multicombs does (order unimportant) so that I can apply a pruning > > shortcut like the one I used in the recursive cartesian product > > algorithm in my original post. > > > > Multiset combination algorithms seem pretty thin on the ground out > > there - as I said, I could only find a description of the procedure > > above, no actual code. The ones I did find are non-recursive. I'm > > hoping some combinatorics and/or recursion experts can offer advice. > > > > Regards, > > > > -- > > > > John
Could convert the following perl script to python? use Data::Dump qw(dump); dump combo([@ARGV], 3); sub combo { my ($t, $k) = @_; my @T = @$t; my @R = (); my %g = (); if ($k == 1) { for (@T) { push @R, $_ unless $g{$_}++; } } else { while (my $x = shift @T) { $p = combo([@T], $k-1); for (@{$p}) { my $q = $x.",".$_; push @R, $q unless $g{$q}++; } } } [@R]; } $ prog.pl cat hat in the the [ "cat,hat,in", "cat,hat,the", "cat,in,the", "cat,the,the", "hat,in,the", "hat,the,the", "in,the,the", ] James -- https://mail.python.org/mailman/listinfo/python-list