On Tuesday, February 4, 2014 7:36:48 PM UTC+5:30, Dennis Lee Bieber wrote:
> On Tue, 4 Feb 2014 05:19:48 -0800 (PST), Ayushi Dalmia
> <ayushidalmia2...@gmail.com> declaimed the following:
>
> > I need to chunk out the outputs otherwise it will give Memory Error. I need
> > to do some postprocessing on the data read from the file too. If I do not
> > stop before memory error, I won't be able to perform any more operations on
> > it.
>
> 10 200MB files is only 2GB... Most any 64-bit processor these days can
> handle that. Even some 32-bit systems could handle it (WinXP booted with
> the server option gives 3GB to user processes -- if the 4GB was installed
> in the machine).
>
> However, you speak of an n-way merge. The traditional merge operation
> only reads one record from each file at a time, examines them for "first",
> writes that "first", reads next record from the file "first" came from, and
> then reassesses the set.
>
> You mention needing to chunk the data -- that implies performing a merge
> sort in which you read a few records from each file into memory, sort them,
> and write them out to newFile1; then read the same number of records from
> each file, sort, and write them to newFile2, up to however many files you
> intend to work with -- at that point you go back and append the next chunk
> to newFile1. When done, each file contains chunks of n*r records. You now
> make newFilex the inputs, read/merge the records from those chunks
> outputting to another file1, when you reach the end of the first chunk in
> the files you then read/merge the second chunk into another file2. You
> repeat this process until you end up with only one chunk in one file.
> --
> Wulfraed Dennis Lee Bieber AF6VN
> wlfr...@ix.netcom.com HTTP://wlfraed.home.netcom.com/
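For reference, the traditional n-way merge described in the quoted message can be sketched roughly as follows. This is only a sketch under assumptions that are not stated in the thread: each input file is already sorted, holds one record per line, and every line ends with a newline. The function name `merge_sorted_files` is made up for illustration. `heapq.merge` is lazy, so only one pending line per input file is held in memory at a time.

```python
import heapq
from contextlib import ExitStack


def merge_sorted_files(input_paths, output_path):
    """Merge already-sorted text files into one sorted file, line by line.

    Assumes every line (including the last) ends with a newline; otherwise
    two records could be glued together in the output.
    """
    with ExitStack() as stack:
        # Open all inputs; ExitStack closes every one of them on exit.
        inputs = [stack.enter_context(open(path, "r")) for path in input_paths]
        with open(output_path, "w") as out:
            # heapq.merge repeatedly yields the smallest pending line and
            # then pulls the next line from the file that line came from.
            out.writelines(heapq.merge(*inputs))
```

Lines are compared as plain strings here. If the records need a different ordering (for example numeric keys), `heapq.merge` accepts a `key=` argument.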
The merge you describe is an option, but it involves a lot of I/O. Also, I do not want any output file to grow beyond a certain size. Once a file reaches that limit, I want to start writing to a new file, because I want to load these files back into memory again later.
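A minimal sketch of the size-capped output I have in mind, under some assumptions: records are text lines, the limit is measured in UTF-8 bytes, and the names `write_in_chunks`, `max_bytes`, and `prefix` are hypothetical, chosen only for this example. A new file is started whenever the next line would push the current file past the limit.

```python
import os


def write_in_chunks(lines, out_dir, max_bytes, prefix="part"):
    """Write lines to numbered files, rolling over before max_bytes is exceeded.

    Returns the list of file paths written, in order. A single line larger
    than max_bytes still gets a file of its own, which will exceed the limit.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    out = None
    written = 0  # bytes written to the current file
    try:
        for line in lines:
            size = len(line.encode("utf-8"))
            # Start a new file at the beginning, or when this line would
            # push a non-empty file over the limit.
            if out is None or (written > 0 and written + size > max_bytes):
                if out is not None:
                    out.close()
                path = os.path.join(out_dir, f"{prefix}_{len(paths):04d}.txt")
                # newline="" disables newline translation so that the byte
                # count matches what actually lands on disk.
                out = open(path, "w", encoding="utf-8", newline="")
                paths.append(path)
                written = 0
            out.write(line)
            written += size
    finally:
        if out is not None:
            out.close()
    return paths
```

`lines` can be any iterable of strings, so a lazy merged stream can be passed in directly and nothing has to be held in memory beyond the current line. Each resulting file can then be read back on its own later.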