On Tuesday, February 4, 2014 7:36:48 PM UTC+5:30, Dennis Lee Bieber wrote:
> On Tue, 4 Feb 2014 05:19:48 -0800 (PST), Ayushi Dalmia
> <ayushidalmia2...@gmail.com> declaimed the following:
>
> >I need to chunk out the outputs otherwise it will give Memory Error. I need 
> >to do some postprocessing on the data read from the file too. If I do not 
> >stop before memory error, I won't be able to perform any more operations on 
> >it.
>
>       10 200MB files is only 2GB... Most any 64-bit processor these days can
> handle that. Even some 32-bit systems could handle it (WinXP booted with
> the server option gives 3GB to user processes -- if the 4GB was installed
> in the machine).
>
>       However, you speak of an n-way merge. The traditional merge operation
> only reads one record from each file at a time, examines them for "first",
> writes that "first", reads next record from the file "first" came from, and
> then reassesses the set.
>
>       You mention needing to chunk the data -- that implies performing a merge
> sort in which you read a few records from each file into memory, sort them,
> and write them out to newFile1; then read the same number of records from
> each file, sort, and write them to newFile2, up to however many files you
> intend to work with -- at that point you go back and append the next chunk
> to newFile1. When done, each file contains chunks of n*r records. You now
> make newFilex the inputs, read/merge the records from those chunks
> outputting to another file1, when you reach the end of the first chunk in
> the files you then read/merge the second chunk into another file2. You
> repeat this process until you end up with only one chunk in one file.
> 
> -- 
>       Wulfraed                 Dennis Lee Bieber         AF6VN
>     wlfr...@ix.netcom.com    HTTP://wlfraed.home.netcom.com/

The merge you describe is an option, but it involves a lot of I/O 
operations. Also, I do not want any output file to grow beyond a certain size. 
Once a file reaches that limit, I want to start writing to a new file, because 
I want to load these files back into memory later.
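
To make that concrete, here is a rough sketch of the kind of thing I have in mind: an n-way merge that reads one line at a time from each input and rolls over to a new output file once a size cap is reached. It assumes each input file is already sorted line by line and that every line ends with a newline. The function name merge_sorted_files, the max_bytes parameter, and the "prefix.0000" naming scheme are all made up for illustration.

```python
import heapq
from contextlib import ExitStack


def merge_sorted_files(input_paths, output_prefix, max_bytes):
    """Merge already-sorted text files into size-capped output parts.

    Hypothetical helper: returns the list of output paths it wrote.
    """
    output_paths = []
    out = None
    written = 0
    with ExitStack() as stack:
        # Open every input; heapq.merge holds only one line per file in memory.
        files = [stack.enter_context(open(p, "r", encoding="utf-8"))
                 for p in input_paths]
        try:
            for line in heapq.merge(*files):
                size = len(line.encode("utf-8"))
                # Start a new part when this line would push the current one
                # past the cap (each part always gets at least one line).
                if out is None or (written and written + size > max_bytes):
                    if out is not None:
                        out.close()
                    path = "%s.%04d" % (output_prefix, len(output_paths))
                    out = open(path, "w", encoding="utf-8", newline="")
                    output_paths.append(path)
                    written = 0
                out.write(line)
                written += size
        finally:
            if out is not None:
                out.close()
    return output_paths
```

Something like merge_sorted_files(["a.txt", "b.txt"], "merged", 200 * 1024 * 1024) would then give parts of at most roughly 200MB each, and each part is itself sorted, so it can be read back on its own later. Lines are compared as plain strings here; a different sort key would need the key argument of heapq.merge.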
-- 
https://mail.python.org/mailman/listinfo/python-list
