I'm involved in a discussion thread in which it has been stated that:

"""
Anything written in a language that is > 20x slower (Perl, Python, PHP) than C/C++ should be instantly rejected by users on those grounds alone.
"""

I've challenged someone to beat the snippet of code below in C, C++, or assembler. It reads in one million pairs of random floats and sorts them by the second member of each pair. I'm not a master Python programmer. Is there anything I could do to make it even faster than it is?

Also, if I try to write the resulting list of tuples back out to a gdbm file, it takes a good 14 seconds, which is far longer than the reading and sorting take. The problem seems to be that the 'f' flag to gdbm.open() is being ignored, and writes are being synced to disk either on each write or on close. I'd really prefer to let the OS decide when to actually write to disk.

I'm using Python 2.5.2, libgdbm 1.8.3, and python-gdbm 2.5.2 under Ubuntu 8.04 beta on an x86_64 architecture.

Thanks for any tips.

=====
import cPickle, gdbm, operator

# Open the gdbm file created by the script further down (read-only by default).
dbmIn = gdbm.open('float_pairs_in.pickel')
print "Reading pairs..."
# The whole list is stored as a single pickled string under the key 'pairs'.
pairs = cPickle.loads(dbmIn['pairs'])
print "Sorting pairs..."
# Sort in place by the second member of each tuple.
pairs.sort(key=operator.itemgetter(1))
print "Done!"
=====

The input file was created with this:

=====
import random, gdbm, cPickle

print "Creating pairs file..."
pairs = [(random.random(), random.random(),) for pair in range(0,1000000)]
# 'n' = always create a new, empty database; 'f' = fast mode (unsynchronized writes).
dbmOut = gdbm.open('float_pairs_in.pickel', 'nf')
# Protocol 2 is the binary pickle format.
dbmOut['pairs'] = cPickle.dumps(pairs, 2)
dbmOut.close()
print "Done!"
=====
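
To see where the time actually goes, the three phases (unpickle, sort, write back) can be timed separately. What follows is only a minimal sketch, not the code timed above. It assumes Python 2.6 or later (it also runs on 3.x), uses plain files instead of gdbm so that no database-level sync is involved, defaults to a smaller N so it finishes quickly, and introduces hypothetical helper names (`timed`, `load_pairs`, `dump_pairs`, `run`) and a made-up output file name.

```python
from __future__ import print_function

import operator
import os
import random
import shutil
import tempfile
import time

try:
    import cPickle as pickle   # Python 2: C implementation
except ImportError:
    import pickle              # Python 3: the C accelerator is used automatically


def timed(label, func, *args, **kwargs):
    """Hypothetical helper: call func, print elapsed wall time, return its result."""
    start = time.time()
    result = func(*args, **kwargs)
    print("%-5s %.3f s" % (label, time.time() - start))
    return result


def load_pairs(path):
    with open(path, 'rb') as f:
        return pickle.load(f)


def dump_pairs(pairs, path):
    # Ordinary buffered write with no explicit fsync, so the OS
    # decides when the data actually reaches the disk.
    with open(path, 'wb') as f:
        pickle.dump(pairs, f, 2)   # protocol 2 = binary pickle format


def run(n=100000):
    """Create n random pairs, then time load, sort, and write-back."""
    tmpdir = tempfile.mkdtemp()
    try:
        in_path = os.path.join(tmpdir, 'float_pairs_in.pickle')
        out_path = os.path.join(tmpdir, 'float_pairs_out.pickle')  # made-up name
        dump_pairs([(random.random(), random.random()) for _ in range(n)], in_path)

        pairs = timed('load', load_pairs, in_path)
        timed('sort', pairs.sort, key=operator.itemgetter(1))
        timed('dump', dump_pairs, pairs, out_path)

        # Read the output back so the caller can check the round trip.
        return load_pairs(out_path)
    finally:
        shutil.rmtree(tmpdir)


if __name__ == '__main__':
    run()
```

Calling `run(1000000)` matches the size of the original test. If the 'dump' phase here is much quicker than the 14 seconds seen with gdbm, that would point at the gdbm write path rather than at pickling itself.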