On Mon, 13 Dec 2010 18:50:38 -0800, gry wrote:

> [python-2.4.3, rh CentOS release 5.5 linux, 24 xeon cpu's, 24GB ram]
> I have a little data generator that I'd like to go faster... any
> suggestions?
> maxint is usually 9223372036854775808(max 64bit int), but could
> occasionally be 99.
> width is usually 500 or 1600, rows ~ 5000.
>
> from random import randint
>
> def row(i, wd, mx):
>     first = ['%d' % i]
>     rest = ['%d' % randint(1, mx) for i in range(wd - 1)]
>     return first + rest
> ...
>     while True:
>         print "copy %s from stdin direct delimiter ',';" % table_name
>         for i in range(i,i+rows):
>             print ','.join(row(i, width, maxint))
>         print '\.'

This isn't entirely clear to me. Why is the while loop indented? I assume
it's part of some other function that you haven't shown us, rather than
part of the function row().

Assuming this, I would say that the overhead of I/O (the print commands)
will likely be tens or hundreds of times greater than the overhead of the
loop, so you're unlikely to see much benefit. You might shave a few
seconds off something that runs for many minutes. I don't see the point,
really.

If the print statements are informative rather than necessary, I would
print every tenth (say) line rather than every line. That should save
*lots* of time.

Replacing "while True" with "while 1" may save a tiny bit of overhead.
Whether it is significant or not is another thing.

Replacing range with xrange should also make a difference, especially if
rows is a large number, since xrange does not build the whole list in
memory first.

Moving the code from row() inline, and replacing string interpolation with
calls to str(), may also help. Making local variables of any globals may
also help a tiny bit. But as I said, you're shaving microseconds of
overhead and spending milliseconds printing, so the difference will be
tiny. For what it's worth, I'd try this:

    # Avoid globals in favour of locals.
    from random import randint
    _maxint = maxint
    loop = xrange(i, i+rows)  # Where does i come from?
    inner_loop = xrange(width)  # Note 1 more than before.
    while 1:
        print "copy %s from stdin direct delimiter ',';" % table_name
        for i in loop:
            row = [str(randint(1, _maxint)) for _ in inner_loop]
            row[0] = str(i)  # replace in place
            print ','.join(row)
        print '\.'

Hope it helps.

-- 
Steven
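
Whether generating the rows or printing them dominates is easy to measure rather than guess. The following is only a minimal sketch, not the original poster's code: make_rows and time_generation_and_output are hypothetical helper names, the default batch sizes are arbitrary, and it is written for Python 3 (time.perf_counter, range) rather than the 2.4.3 mentioned in the question. It times building one batch of rows separately from writing that batch to a file-like object.

```python
import io
import random
import time


def make_rows(start, rows, width, maxint):
    """Build `rows` CSV lines: the row number, then width-1 random ints."""
    randint = random.randint  # local alias avoids an attribute lookup per call
    lines = []
    for i in range(start, start + rows):
        fields = [str(i)]
        fields.extend(str(randint(1, maxint)) for _ in range(width - 1))
        lines.append(','.join(fields))
    return lines


def time_generation_and_output(out, rows=200, width=50, maxint=2 ** 63):
    """Return (generate_seconds, write_seconds, lines) for one batch."""
    t0 = time.perf_counter()
    lines = make_rows(0, rows, width, maxint)
    t1 = time.perf_counter()
    # One write per batch instead of one print per row.
    out.write('\n'.join(lines))
    out.write('\n')
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1, lines


if __name__ == '__main__':
    gen, write, _ = time_generation_and_output(io.StringIO())
    print('generate: %.4fs  write: %.4fs' % (gen, write))
```

Passing sys.stdout instead of io.StringIO() includes the real cost of the terminal or pipe in the write figure. The two numbers show which half is worth optimising on a given machine.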