Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread n00m
On Nov 29, 11:43 pm, Bearophile wrote: > Anyway, you may try a pure-Python2.x > implementation:http://suffixtree.googlecode.com/files/suffixtree-0.1.py Ouch, Bearie... 214 lines of code are there. Ok.. I tested it. Firstly I replaced all "print " with "pass##print " (aiming to avoid intensive pr

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread Bearophile
n00m: > I suspect that building of Suffix Tree would > be a big exec.time-consuming overhead In C/D/C++ there are ways to allocate memory in smarter ways, using pools, arenas, stacks, freelists, etc. With that, using a compact encoding of tree nodes (there are many different tricks to reduce spac

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread n00m
Tested both my codes against a random string of length = 1. === from random import choice s = '' for i in xrange(1): s += choice(('a','b','c','d','e','f')) === C++: ~0.28s Python: ~0.48s PS I suspect th

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread inhahe
On Sun, Nov 29, 2009 at 9:10 AM, n00m wrote: > > Even if Py by 4x *slower* -- it's still a perfect Ok for it (C# will > be > much (much) slower than Python). > > How do you figure? As far as I know C# is many, many times faster than Python. (i was disappointed to find out that even IronPython is

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread n00m
http://en.wikipedia.org/wiki/Suffix_tree Looks not very friendly appealing :-) -- http://mail.python.org/mailman/listinfo/python-list

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread n00m
On Nov 29, 3:15 pm, Bearophile wrote: > Maybe in your C++ code there's something that can be improved, this is > a 1:1 translation to D (V.1) language (using dlibs) and it's about 2.2 > times faster than the Psyco version:http://codepad.org/MQLj0ydB > Using a smarter usage of memory (that is avoid

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread Bearophile
n00m: > my home tests proved Python is a fellow fast beast of C++. > Quite unexpected (I expected Py would be by ~10 times slower). > PS > Both my codes are identical in their algorithms. > = > 0.016         0.0150001049042   <--- exec times Maybe in your C++ code ther

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread Bearophile
n00m: > My Py solution: > ... Cute. You can replace: a[ord(s[i])] += [i] With: a[ord(s[i])].append(i) If you want to micro-optimize this with Psyco may be a little faster: (lev >> 1) Than: lev // 2 But to increase speed it's usually better to reduce memory allocations. So you can try to find a w

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread Bearophile
n00m: > This worked out in 5.28s > Imo it's not that *much* slower > (of course, Psyco can't help here) There's no need of a chain here, so you can rewrite this: import itertools def subs(s): return len(set(itertools.chain( s[i:j] for i in xrange(len(s)) for j in xrang

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread n00m
This worked out in 5.28s Imo it's not that *much* slower (of course, Psyco can't help here) === import itertools def subs(s): return len(set(itertools.chain( s[i:j] for i in xrange(len(s)) for j in xrange(i, len(s)+1 - 1 from time i

Re: Number of distinct substrings of a string [continuation]

2009-11-29 Thread n00m
My Py solution: = import psyco psyco.full() def foo(s): n = len(s) s = s + ' ' a = [[] for i in xrange(128)] ans = 0 for i in xrange(n - 1, -1, -1): lev = 0 for st in xrange(len(a[ord(s[i])]) - 1, -1, -1):