On Mar 18, 1:40 pm, [EMAIL PROTECTED] (Alex Martelli) wrote:
> [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > On 3/18/07, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:
> > > In <[EMAIL PROTECTED]>, Daniel Nogradi
> > > wrote:
> > >
> > > >> f = open('file.txt','r')
> > > >> for line in f:
> > > >>     db[line.split(' ')[0]] = line.split(' ')[-1]
> > > >>     db.sync()
> > >
> > > > What is db here?  Looks like a dictionary but that doesn't have a sync
> > > > method.
> > >
> > > Shelves (`shelve` module) have this API.  And syncing forces the changes
> > > to be written to disk, so all caching and buffering of the operating
> > > system is prevented.  So this may slow down the program considerably.
> >
> > It is a handle for bsddb
> >
> > import bsddb
> > db=bsddb.hashopen('db_filename')
> >
> > Syncing will definitely slow down. I will slow that down. But is there
> > any improvement I can do to the other part the splitting and setting
> > the key value/pair?
>
> Unless each line is huge, how exactly you split it to get the first and
> last blank-separated word is not going to matter much.
>
> Still, you should at least avoid repeating the splitting twice, that's
> pretty obviously sheer waste: so, change that loop body to:
>
>     words = line.split(' ')
>     db[words[0]] = words[-1]
>
> If some lines are huge, splitting them entirely may be far more work
> than you need.  In this case, you may do two partial splits instead, one
> direct and one reverse:
>
>     first_word = line.split(' ', 1)[0]
>     last_word = line.rsplit(' ', 1)[-1]
>     db[first_word] = last_word

I'd guess the following is in theory faster, though it might not make
a measurable difference:

    first_word = line[:line.index(' ')]
    last_word = line[line.rindex(' ')+1:]
    db[first_word] = last_word

By the way, a gotcha is that the file iterator yields lines that
retain the newline character; you have to strip it off if you don't
want it, either with .rstrip('\n') or (at least on *n*x) by omitting
the last character:

    last_word = line[line.rindex(' ')+1 : -1]

George
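
P.S. For anyone who wants to check the three variants on their own data, here is a minimal sketch that puts them side by side. Several things in it are assumptions rather than anything from the thread: it is written for Python 3, a plain dict stands in for the bsddb handle, and the function names (first_last_split, first_last_partial, first_last_index, load) are made up for illustration. Each variant strips the newline with .rstrip('\n') so that a final line without a trailing newline is handled the same way as the others.

```python
import io
import timeit


def first_last_split(line):
    # Full split: simple, but builds a list of every word on the line.
    words = line.rstrip('\n').split(' ')
    return words[0], words[-1]


def first_last_partial(line):
    # Two partial splits: at most one cut from each end of the line.
    line = line.rstrip('\n')
    return line.split(' ', 1)[0], line.rsplit(' ', 1)[-1]


def first_last_index(line):
    # Slicing around the first and last blank. Note that str.index and
    # str.rindex raise ValueError when the line contains no blank at all.
    line = line.rstrip('\n')
    return line[:line.index(' ')], line[line.rindex(' ') + 1:]


def load(lines, db, extract):
    # db is anything with dict-style item assignment (a plain dict here,
    # standing in for the bsddb handle from the thread).
    for line in lines:
        key, value = extract(line)
        db[key] = value
    return db


# The last line deliberately has no trailing newline.
sample = "alpha x y 1\nbeta z 2\ngamma 3 4"
results = [load(io.StringIO(sample), {}, f)
           for f in (first_last_split, first_last_partial, first_last_index)]

if __name__ == '__main__':
    # Rough timing on one very long line; the numbers depend on the machine.
    long_line = 'key ' + 'filler ' * 10000 + 'value\n'
    for f in (first_last_split, first_last_partial, first_last_index):
        print(f.__name__, timeit.timeit(lambda: f(long_line), number=1000))
```

On short lines I would not expect the three to differ in any way that matters; the timing loop is only there to see what happens when a line gets very long. The sample's last line shows why .rstrip('\n') is the safer choice: slicing off the last character with [:-1] would eat a real character there.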