On 6/20/2011 10:59 PM, king6c...@gmail.com wrote:
> Hi,
> I have two large files,each has more than 200000000 lines,and each line
> consists of two fields,one is the id and the other a value,
> the ids are sorted.
>
> for example:
>
> file1
> (uin_a y)
> 1 10000245
> 2 12333
> 3 324543
> 5 3464565
> ....
>
>
> file2
> (uin_b gift)
> 1 34545
> 3 6436466
> 4 35345646
> 5 463626
> ....
>
> I want to merge them and get a file,the lines of which consists of an id
> and the sum of the two values in file1 and file2。
> the codes are as below:

One minor thing you can do is use bound methods, which saves an
attribute lookup on every call:

> uin_y=open('file1')
> uin_gift=open('file2')

ynext = open('file1').next
gnext = open('file2').next

> y_line=uin_y.next()
> gift_line=uin_gift.next()

y_line = ynext()
gift_line = gnext()

and similarly for all .next appearances in what follows.

> while 1:
>     try:
>         uin_a,y=[int(i) for i in y_line.split()]

This creates an unnecessary second temporary list. Unroll the loop.

pair = y_line.split()
uin_a = int(pair[0])
y = int(pair[1])

>         uin_b,gift=[int(i) for i in gift_line.split()]

The same applies to this line.

>         if uin_a==uin_b:
>             score=y+gift
>             print uin_a,score
>             y_line=uin_y.next()
>             gift_line=uin_gift.next()
>         if uin_a<uin_b:
>             print uin_a,y
>             y_line=uin_y.next()
>         if uin_a>uin_b:
>             print uin_b,gift
>             gift_line=uin_gift.next()
>     except StopIteration:
>         break

-- 
Terry Jan Reedy

-- 
http://mail.python.org/mailman/listinfo/python-list
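
For reference, the two suggestions above (bound methods and unrolled
parsing) can be put together roughly as follows. This is only a sketch,
not the original poster's code: it is written for Python 3, where the
bound method is spelled __next__ rather than .next, and the names
read_pair and merge_sums are made up for illustration. It also assumes
that ids left over in the longer file should still be emitted once the
other file runs out, which the quoted loop does not do.

```python
def read_pair(next_line):
    """Return (id, value) from the next line, or None at end of input."""
    try:
        pair = next_line().split()
    except StopIteration:
        return None
    # Unrolled parse: index the split result directly instead of
    # building a second temporary list with a comprehension.
    return int(pair[0]), int(pair[1])


def merge_sums(lines_a, lines_b):
    """Yield (id, total) from two iterables of 'id value' lines sorted by id."""
    # Bound methods, looked up once (Python 2 would use .next here).
    next_a = iter(lines_a).__next__
    next_b = iter(lines_b).__next__

    a = read_pair(next_a)
    b = read_pair(next_b)
    while a is not None and b is not None:
        if a[0] == b[0]:
            yield a[0], a[1] + b[1]
            a = read_pair(next_a)
            b = read_pair(next_b)
        elif a[0] < b[0]:
            yield a
            a = read_pair(next_a)
        else:
            yield b
            b = read_pair(next_b)

    # Assumption: emit whatever remains in the file that is not exhausted.
    while a is not None:
        yield a
        a = read_pair(next_a)
    while b is not None:
        yield b
        b = read_pair(next_b)


if __name__ == "__main__":
    # Sample rows taken from the example in the question.
    file1 = ["1 10000245\n", "2 12333\n", "3 324543\n", "5 3464565\n"]
    file2 = ["1 34545\n", "3 6436466\n", "4 35345646\n", "5 463626\n"]
    for uin, score in merge_sums(file1, file2):
        print(uin, score)
```

With real files you would pass the open file objects instead of lists,
e.g. merge_sums(open('file1'), open('file2')). Each line is parsed only
once here, whereas the quoted loop re-parses both current lines on every
pass even when only one of them has advanced.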