Hi folks,

Thanks for all the help. I tried running the various options, and
here is what I found:


from array import array
from time import time

def f1(recs, cols):
    for r in recs:
        for i,v in enumerate(r):
            cols[i].append(v)

def f2(recs, cols):
    for r in recs:
        for v,c in zip(r, cols):
            c.append(v)

def f3(recs, cols):
    for r in recs:
        map(list.append, cols, r)

def f4(recs):
    return zip(*recs)

records = [ tuple(range(10)) for i in xrange(1000000) ]

columns = tuple([] for i in xrange(10))
t = time()
f1(records, columns)
print 'f1: ', time()-t

columns = tuple([] for i in xrange(10))
t = time()
f2(records, columns)
print 'f2: ', time()-t

columns = tuple([] for i in xrange(10))
t = time()
f3(records, columns)
print 'f3: ', time()-t

t = time()
columns = f4(records)
print 'f4: ', time()-t

f1:  5.10132408142
f2:  5.06787180901
f3:  4.04700708389
f4:  19.13633203506

So there is some benefit in using map(list.append): f3 is about 20%
faster than f1 and f2. f4 is very clever and cool, but it doesn't seem
to scale: it was nearly four times slower here.
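For anyone who wants to convince themselves that the append loop and the zip(*recs) trick compute the same thing, here is a small sketch on a toy input. The names transpose_append and transpose_zip are made up for this illustration, and the code is written so it should also run on Python 3 (where zip is lazy and has to be forced with list()). Note that the zip version hands back tuples rather than lists.

```python
def transpose_append(recs, ncols):
    # Same idea as f2: one list per column, appended to row by row.
    cols = tuple([] for _ in range(ncols))
    for r in recs:
        for v, c in zip(r, cols):
            c.append(v)
    return cols

def transpose_zip(recs):
    # Same idea as f4: every record is passed to zip as a separate argument.
    # list() forces the result on Python 3, where zip returns an iterator.
    return list(zip(*recs))

small = [(1, 2, 3), (4, 5, 6)]
by_append = transpose_append(small, 3)   # tuple of lists
by_zip = transpose_zip(small)            # list of tuples
```

The two results hold the same values column by column; only the container types differ.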

Incidentally, it took me a while to figure out why the following
initialization doesn't work:
  columns = ([],)*10
Apparently you end up with 10 references to the same list, not 10
separate lists.
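A minimal sketch of that pitfall, using 3 columns instead of 10 to keep it short (the variable names are just for illustration):

```python
# Tuple repetition copies the reference to the one list, not the list itself.
shared = ([],) * 3
shared[0].append(1)
# Every slot now shows the appended value, because all slots are one object.
same_object = all(c is shared[0] for c in shared)

# A generator expression evaluates [] once per element instead.
separate = tuple([] for _ in range(3))
separate[0].append(1)
# Only the first list changed; the other two are still empty.
```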

Finally, in my case the output columns are integer arrays (to save
memory). I can still use array.append, but it's a little slower, so the
differences among f1-f3 get even smaller. f4 is not an option with
arrays.
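Roughly what the array variant could look like, as a sketch only: the 'i' typecode (signed C int) and the name fill_columns are assumptions for the example, not necessarily what the real code uses. The last line shows why the zip approach fits poorly here: zip produces plain tuples, so each column would still need converting into an array afterwards.

```python
from array import array

def fill_columns(recs, cols):
    # Same loop as f2; array objects provide append() just like lists.
    for r in recs:
        for v, c in zip(r, cols):
            c.append(v)

records = [tuple(range(4)) for _ in range(3)]
columns = tuple(array('i') for _ in range(4))   # 'i' = signed C int (assumed)
fill_columns(records, columns)

# zip(*records) yields tuples, so an extra conversion pass would be needed.
converted = [array('i', col) for col in zip(*records)]
```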
-- 
http://mail.python.org/mailman/listinfo/python-list