Shouldn't this

>>> print re.sub('a','\\n','bab')
b
b

output b\nb instead?

Massimo

On Oct 16, 2007, at 1:34 AM, George Sakkis wrote:

> On Oct 15, 11:02 pm, 7stud <[EMAIL PROTECTED]> wrote:
>> I'm applying groupby() in a very simplistic way to split up some data,
>> but when I timeit against another method, it takes twice as long. The
>> following groupby() code groups the data between the "</tr>" strings:
>>
>> data = [
>>     "1.5","</tr>","2.5","3.5","4.5","</tr>","</tr>","5.5","6.5","</tr>",
>>     "1.5","</tr>","2.5","3.5","4.5","</tr>","</tr>","5.5","6.5","</tr>",
>>     "1.5","</tr>","2.5","3.5","4.5","</tr>","</tr>","5.5","6.5","</tr>",
>> ]
>>
>> import itertools
>>
>> def key(s):
>>     if s[0] == "<":
>>         return 'a'
>>     else:
>>         return 'b'
>>
>> def test3():
>>
>>     master_list = []
>>     for group_key, group in itertools.groupby(data, key):
>>         if group_key == "b":
>>             master_list.append(list(group) )
>>
>> def test1():
>>     master_list = []
>>     row = []
>>
>>     for elmt in data:
>>         if elmt[0] != "<":
>>             row.append(elmt)
>>         else:
>>             if row:
>>                 master_list.append(" ".join(row) )
>>                 row = []
>>
>> import timeit
>>
>> t = timeit.Timer("test3()", "from __main__ import test3, key, data")
>> print t.timeit()
>> t = timeit.Timer("test1()", "from __main__ import test1, data")
>> print t.timeit()
>>
>> --output:---
>> 42.791079998
>> 19.0128788948
>>
>> I thought groupby() would be faster. Am I doing something wrong?
>
> Yes and no. Yes, the groupby version can be improved a little by
> calling a builtin method instead of a Python function. No, test1 still
> beats it hands down (and with Psyco even further); it is almost as good
> as it gets in pure Python.
>
> FWIW, here's a faster and more compact version with groupby:
>
> def test3b(data):
>     join = ' '.join
>     return [join(group) for key,group in
>             itertools.groupby(data, "</tr>".__eq__)
>             if not key]
>
>
> George
>
> --
> http://mail.python.org/mailman/listinfo/python-list
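
For anyone who wants to check that the two approaches discussed above really produce the same rows, here is a minimal sketch. It is written for Python 3 (the thread itself uses Python 2 print statements), and it makes a few assumptions that are not in the original posts: the names rows_with_groupby, rows_with_loop and SEPARATOR are made up for illustration, the loop compares against the full "</tr>" string instead of testing elmt[0], and it flushes a trailing row, which test1 does not do. It says nothing about timings.

```python
import itertools

# Shortened copy of the sample data from the thread.
data = [
    "1.5", "</tr>", "2.5", "3.5", "4.5", "</tr>", "</tr>", "5.5", "6.5", "</tr>",
]

SEPARATOR = "</tr>"  # hypothetical constant, not in the original posts


def rows_with_groupby(items):
    """Join runs of non-separator items, in the style of test3b."""
    join = " ".join
    # The bound method SEPARATOR.__eq__ serves as the key function, so
    # consecutive items are grouped by whether they equal the separator.
    return [
        join(group)
        for is_separator, group in itertools.groupby(items, SEPARATOR.__eq__)
        if not is_separator
    ]


def rows_with_loop(items):
    """Join runs of non-separator items with a plain loop, in the style of test1."""
    rows = []
    row = []
    for item in items:
        if item != SEPARATOR:
            row.append(item)
        elif row:
            rows.append(" ".join(row))
            row = []
    if row:
        # Assumption beyond test1: keep a final row that has no closing separator.
        rows.append(" ".join(row))
    return rows


print(rows_with_groupby(data))
print(rows_with_loop(data))
```

One caveat with the bound-method key: "</tr>".__eq__ returns NotImplemented (which is truthy) for non-string items, so this version is only safe when every element of the list is a str.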