Re: groupby() seems slow

Raymond Hettinger Tue, 16 Oct 2007 13:08:07 -0700

On Oct 15, 8:02 pm, 7stud <[EMAIL PROTECTED]> wrote:
> t = timeit.Timer("test3()", "from __main__ import test3, key, data")
> print t.timeit()
> t = timeit.Timer("test1()", "from __main__ import test1, data")
> print t.timeit()
>
> --output:---
> 42.791079998
> 19.0128788948
>
> I thought groupby() would be faster.  Am I doing something wrong?


The groupby() function is not where you are losing speed.  In test1,
you've inlined the code for computing the key.  In test3, groupby()
calls a pure-Python key function once per element, and the overhead
of those calls dominates the running time.  For an apples-to-apples
comparison, try something like this:

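def test4():
    master_list = []
    row = []
    for elem in data:
        # Call key() per element, the same work groupby() does in test3.
        if key(elem) == 'a':
            row.append(elem)
        elif row:
            # A run of 'a' elements just ended: store it and reset.
            master_list.append(' '.join(row))
            del row[:]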
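
To make the comparison concrete, here is a minimal, self-contained sketch
that times three variants side by side.  The sample data, the key()
function, and the three function names are assumptions made up for
illustration; they are not the original poster's test1/test3 or data.
Unlike test4 above, the loops here also flush a run that is still open
when the data ends, so that all three variants return the same list.

```python
import timeit
from itertools import groupby

# Hypothetical sample input; the original post's data is not shown.
data = ['a', 'a', 'b', 'a', 'b', 'b', 'a', 'a', 'a']


def key(elem):
    # Hypothetical pure-Python key function (assumption for this sketch).
    return elem


def with_groupby():
    # groupby() calls key() once per element of data.
    return [' '.join(group) for k, group in groupby(data, key) if k == 'a']


def with_key_call():
    # Hand-written loop that also pays for one key() call per element.
    result = []
    row = []
    for elem in data:
        if key(elem) == 'a':
            row.append(elem)
        elif row:
            result.append(' '.join(row))
            row = []
    if row:  # flush a run that reaches the end of the data
        result.append(' '.join(row))
    return result


def with_inlined_test():
    # Same loop with the key computation inlined: no function call.
    result = []
    row = []
    for elem in data:
        if elem == 'a':
            row.append(elem)
        elif row:
            result.append(' '.join(row))
            row = []
    if row:  # flush a run that reaches the end of the data
        result.append(' '.join(row))
    return result


if __name__ == '__main__':
    for func in (with_groupby, with_key_call, with_inlined_test):
        # Best of three short runs; absolute numbers depend on the machine.
        best = min(timeit.Timer(func).repeat(repeat=3, number=10000))
        print(func.__name__, best)
```

The fair comparison is with_groupby() against with_key_call(), since
both make one key() call per element.  with_inlined_test() shows how
much of the difference comes from skipping the function call alone.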


Raymond


-- 
http://mail.python.org/mailman/listinfo/python-list
