On Apr 2, 11:50 am, Derek Martin <[EMAIL PROTECTED]> wrote:
> On Wed, Apr 02, 2008 at 10:59:57AM -0400, Derek Tracy wrote:
> > I generated code that works wonderfully for files under 2Gb in size,
> > but the majority of the files I am dealing with are over the 2Gb
> > limit:
> >
> >     ary = array.array('H', INPUT.read())
>
> You're trying to read the file all at once. You need to break your
> reads up into smaller chunks, in a loop. You're essentially trying to
> store more data in memory than your OS can actually access in a single
> process...
>
> Something like this (off the top of my head, I may have overlooked
> some detail, but it should at least illustrate the idea):
>
>     # read a meg at a time
>     buffsize = 1048576
>     while True:
>         buff = INPUT.read(buffsize)
>         OUTPUT.write(buff)
>         if len(buff) != buffsize:
>             break
Or more idiomatically:

    from functools import partial

    for buff in iter(partial(INPUT.read, 10 * 1024**2), ''):
        # process each 10MB buffer
        ...

George
--
http://mail.python.org/mailman/listinfo/python-list
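For reference, a minimal self-contained sketch combining both suggestions: copy INPUT to OUTPUT in fixed-size chunks so the whole multi-gigabyte file is never held in memory. The file names here are hypothetical, and the snippet assumes Python 3, where a binary read() returns bytes, so the sentinel for iter() is b'' rather than the '' used in the Python 2-era post above.

    from functools import partial

    CHUNK = 10 * 1024 ** 2  # read 10 MB per iteration

    # hypothetical file names, used only for illustration
    with open('huge_input.dat', 'rb') as INPUT, \
         open('copy_output.dat', 'wb') as OUTPUT:
        # iter() calls INPUT.read(CHUNK) repeatedly until it returns
        # the sentinel b'' (end of file), yielding one buffer at a time
        for buff in iter(partial(INPUT.read, CHUNK), b''):
            OUTPUT.write(buff)

The same loop works for processing each chunk (e.g. feeding it to array.array('H', buff)) instead of writing it back out, as long as the per-chunk work does not require the entire file at once.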