Jack Diederich wrote: > On Thu, Feb 09, 2006 at 10:23:12AM -0800, [EMAIL PROTECTED] wrote: > > Jack Diederich wrote: > > > On Wed, Feb 08, 2006 at 12:50:19PM -0800, Swroteb wrote: > > > > Hi there, > > > > > > > > I've got a reasonably sized list of objects that I'd like to pull out > > > > all combinations of five elements from. Right now I have a way to do > > > > this that's quite slow, but manageable. I know there must be a better > > > > way to do this, but I'm not sure what it is. Here's what I've got so > > > > far: > > > > > > > > > > import probstat # http://probstat.sourceforge.net > > > for (a, b, c) in probstat.Combination(range(9), 3): # 0..9, pick 3 > > > print a, b, c > > > > > > It is a C extension that does permutations & combinations and is > > > about 10x faster than doing it in pure python [I'm the author]. > > > It is also the 3rd result for "python combination" and 5th for > > > "python permutaiton" but every month someone posts to c.l.py asking > > > for something similar. Go figure. > > > > Did you ever figure that some people use Windows? > > I don't have a windows dev box so I can't vouch for this binary but > a user sent me a windows .pyd yesterday. > > http://jackdied.com/static/probstat.pyd > > -jackdied
Hey, thanks! I have a pure Python permutation generator that generates a Cartesian product of a string with itself with filters to emulate permutation w/ replacemment combination w/ replacemment permutation w/o replacemment combination w/o replacemment so I put together a little test program to see how probstat compares. Correspondence to my emulations is (as far as I can tell) permutation w/ replacemment: equivalent to probstat.Cartesian combination w/ replacemment: no equivalent permutation w/o replacemment: equivalent to probstat.Permutation combination w/o replacemment: equivalent to probstat.Combination Here's the test program: ----------------------------------------------------- import probstat import time def cxxx(m): return '(' + ' and '.join(['(i%s!=i%s)' % (m,i) for i in range(m)]) + ')' def ooloop5(a, n, perm=True, repl=True): if (not repl) and (n>len(a)): return p = [] loop = '\n'.join([(' ' * i) + 'for i%s in a:' % i for i in range(n)]) + '\n' indt = ' ' * n sub1 = indt + 'if perm and repl:\n' sub2 = indt + 'if (not perm) and repl:\n' ccc2 = ' and '.join(['(i%s>=i%s)' % (i,i-1) for i in range(1,n)]) con2 = indt + ' if ' + ccc2 + ':\n' sub3 = indt + 'if perm and (not repl):\n' cccx = ' and '.join([cxxx(m) for m in range(1,n)]) con3 = indt + ' if ' + cccx + ':\n' sub4 = indt + 'if (not perm) and (not repl):\n' ccc4 = ' and '.join(['(i%s>i%s)' % (i,i-1) for i in range(1,n)]) con4 = indt + ' if ' + ccc4 + ':\n' bod1 = indt + ' s = ' + '+'.join(['i%s' % i for i in range(n)]) + '\n' bod2 = indt + ' p.append(s)\n' bod3 = indt + ' s = ' + '+'.join(['i%s' % i for i in range(n)]) + '\n' bod4 = indt + ' p.append(s)\n' e = loop + sub1 + bod1 + bod2 + sub2 + con2 + bod3 + bod4 + sub3 + con3 + bod3 + bod4 + sub4 + con4 + bod3 + bod4 exec e return p def p_cart(a,n): A = list(a) c = probstat.Cartesian([A for i in range(n)]) p = [] for i in c: p.append(''.join(i)) return p def p_perm(a,n): A = list(a) t0 = time.time() if len(A)<n: c = probstat.Permutation(A,n) else: c = probstat.Permutation(A) print time.time()-t0 p = [] for i in c: p.append(''.join(i)) return p def p_comb(a,n): A = list(a) c = probstat.Combination(A,n) p = [] for i in c: p.append(''.join(i)) return p print 'permutation w/ replacemment' p = ooloop5("abc", 3, True, True) print p print print 'combination w/ replacemment' p = ooloop5("abc", 3, False, True) print p print print 'permutation w/o replacemment' p = ooloop5("abc", 3, True, False) print p print print 'combination w/o replacemment' p = ooloop5("abc", 3, False, False) print p print print 'probstat.Cartesian' p = p_cart('abc',3) print p print print 'probstat.Permutation' p = p_perm('abc',3) print p print print 'probstat.Combination' p = p_comb('abc',3) print p print print 'permutation w/ replacemment "abcdefghijklmnopqrstuvwxyz":4' t0 = time.time() p = ooloop5("abcdefghijklmnopqrstuvwxyz", 4, True, True) t1 = time.time() print len(p),'4-letter words',t1-t0,'seconds' print 'combination w/ replacemment "abcdefghijklmnopqrstuvwxyz":4' t0 = time.time() p = ooloop5("abcdefghijklmnopqrstuvwxyz", 4, False, True) t1 = time.time() print len(p),'4-letter words',t1-t0,'seconds' print 'permutation w/o replacemment "abcdefghijklmnopqrstuvwxyz":4' t0 = time.time() p = ooloop5("abcdefghijklmnopqrstuvwxyz", 4, True, False) t1 = time.time() print len(p),'4-letter words',t1-t0,'seconds' print 'combination w/o replacemment "abcdefghijklmnopqrstuvwxyz":4' t0 = time.time() p = ooloop5("abcdefghijklmnopqrstuvwxyz", 4, False, False) t1 = time.time() print len(p),'4-letter words',t1-t0,'seconds' print print 'probstat.Cartesian "abcdefghijklmnopqrstuvwxyz":4' t0 = time.time() p = p_cart('abcdefghijklmnopqrstuvwxyz',4) t1 = time.time() print len(p),'4-letter words',t1-t0,'seconds' print 'probstat.Permutation "abcdefghijklmnopqrstuvwxyz":4' t0 = time.time() p = p_perm('abcdefghijklmnopqrstuvwxyz',4) t1 = time.time() print len(p),'4-letter words',t1-t0,'seconds' print 'probstat.Combination "abcdefghijklmnopqrstuvwxyz":4' t0 = time.time() p = p_comb('abcdefghijklmnopqrstuvwxyz',4) t1 = time.time() print len(p),'4-letter words',t1-t0,'seconds' ----------------------------------------------------- The short tests worked out fine permutation w/ replacemment ['aaa', 'aab', 'aac', 'aba', 'abb', 'abc', 'aca', 'acb', 'acc', 'baa', 'bab', 'bac', 'bba', 'bbb', 'bbc', 'bca', 'bcb', 'bcc', 'caa', 'cab', 'cac', 'cba', 'cbb', 'cbc', 'cca', 'ccb', 'ccc'] combination w/ replacemment ['aaa', 'aab', 'aac', 'abb', 'abc', 'acc', 'bbb', 'bbc', 'bcc', 'ccc'] permutation w/o replacemment ['abc', 'acb', 'bac', 'bca', 'cab', 'cba'] combination w/o replacemment ['abc'] probstat.Cartesian ['aaa', 'baa', 'caa', 'aba', 'bba', 'cba', 'aca', 'bca', 'cca', 'aab', 'bab', 'cab', 'abb', 'bbb', 'cbb', 'acb', 'bcb', 'ccb', 'aac', 'bac', 'cac', 'abc', 'bbc', 'cbc', 'acc', 'bcc', 'ccc'] probstat.Permutation ['abc', 'acb', 'bac', 'bca', 'cab', 'cba'] probstat.Combination ['abc'] ----------------------------------------------------- Unfortunately, the long test died (out of virtual memory) executing the probstat.Permution test. But here's the strange thing. When executed from the Idle prompt, it works fine. >>> import probstat >>> A = list('abcdefghijklmnopqrstuvwxyz') >>> c = probstat.Permutation(A,4) >>> print len(c) 358800 >>> for i in c[:10]:print i ['a', 'b', 'c', 'd'] ['a', 'b', 'd', 'c'] ['a', 'c', 'b', 'd'] ['a', 'c', 'd', 'b'] ['a', 'd', 'b', 'c'] ['a', 'd', 'c', 'b'] ['b', 'a', 'c', 'd'] ['b', 'a', 'd', 'c'] ['b', 'c', 'a', 'd'] ['b', 'c', 'd', 'a'] >>> p = [] >>> for i in c: p.append(''.join(i)) >>> >>> for i in p[:10]:print i abcd abdc acbd acdb adbc adcb bacd badc bcad bcda ----------------------------------------------------- And if I comment out the Permutation test, everything else works ok. permutation w/ replacemment "abcdefghijklmnopqrstuvwxyz":4 456976 4-letter words 1.23500013351 seconds combination w/ replacemment "abcdefghijklmnopqrstuvwxyz":4 23751 4-letter words 0.68799996376 seconds permutation w/o replacemment "abcdefghijklmnopqrstuvwxyz":4 358800 4-letter words 1.67199993134 seconds combination w/o replacemment "abcdefghijklmnopqrstuvwxyz":4 14950 4-letter words 0.59299993515 seconds probstat.Cartesian "abcdefghijklmnopqrstuvwxyz":4 456976 4-letter words 1.42200016975 seconds probstat.Combination "abcdefghijklmnopqrstuvwxyz":4 14950 4-letter words 0.0780000686646 seconds ----------------------------------------------------- But it fails if I call it from inside a function. >>> import probstat >>> import time >>> def p_perm(a,n): A = list(a) if len(A)<n: c = probstat.Permutation(A,n) else: c = probstat.Permutation(A) p = [] for i in c: p.append(''.join(i)) return p >>> p = p_perm('abcdefghijklmnopqrstuvwxyz',4) Crash! Out of virtual memory. ----------------------------------------------------- I noticed that the probstat call takes virtually no time, most of it taken by the p.append loop. >>> def p_perm(a,n): A = list(a) if len(A)<n: c = probstat.Permutation(A,n) else: c = probstat.Permutation(A) print 'probstat finished' p = [] for i in c: p.append(''.join(i)) return p >>> p = p_perm('abcdefghijklmnopqrstuvwxyz',4) probstat finished Aha! Must be dying during the p.append loop. ----------------------------------------------------- So we'll skip the append for now. def p_perm(a,n): A = list(a) t0 = time.time() if len(A)<n: q = probstat.Permutation(A,n) else: q = probstat.Permutation(A) print time.time()-t0 #p = [] #for i in c: # p.append(''.join(i)) #return p print len(q) for i in q[:10]: print i return q probstat.Permutation "abcdefghijklmnopqrstuvwxyz":4 0.0 -1853882368 ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'] ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'z', 'y'] ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'y', 'x', 'z'] ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'y', 'z', 'x'] ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'z', 'x', 'y'] ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'z', 'y', 'x'] ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'x', 'w', 'y', 'z'] ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'x', 'w', 'z', 'y'] ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'x', 'y', 'w', 'z'] ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'x', 'y', 'z', 'w'] -1853882368 4-letter words 0.25 seconds HUH??? No wonder the append loop has convulsions! and why are the lists length 26? s/b len 4. And yet... >>> q = probstat.Permutation(list('abcdefghijklmnopqrstuvwxyz'),4) >>> print len(q) 358800 >>> for i in q[:10]:print i ['a', 'b', 'c', 'd'] ['a', 'b', 'd', 'c'] ['a', 'c', 'b', 'd'] ['a', 'c', 'd', 'b'] ['a', 'd', 'b', 'c'] ['a', 'd', 'c', 'b'] ['b', 'a', 'c', 'd'] ['b', 'a', 'd', 'c'] ['b', 'c', 'a', 'd'] ['b', 'c', 'd', 'a'] ----------------------------------------------------- That's one of the strangest problems I've ever seen. I don't know if this has anything to do with the pyd file, but I figured you would like to know. -- http://mail.python.org/mailman/listinfo/python-list