On May 22, 8:51 am, "A.T.Hofkamp" <[EMAIL PROTECTED]> wrote:
> On 2008-05-22, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
>
> > Hi, I wanted to know how cautious it is to do something like:
> >
> >     f = file("filename", "rb")
> >     f.read()
> >
> > for a possibly huge file. When calling f.read(), and not doing
> > anything with the return value, what is Python doing internally? Is it
> > loading the content of the file into memory (regardless of whether it
> > is discarding it immediately)?
>
> I am not a Python interpreter developer, but as user, yes I'd expect that to
> happen. The method doesn't know you are not doing anything with its return
> value.
>
> > In my case, what I'm doing is sending the return value through a
> > socket:
> >
> >     sock.send(f.read())
> >
> > Is that gonna make a difference (memory-wise)? I guess I'm just
> > concerned with whether I can do a file.read() for any file in the
> > system in an efficient and memory-kind way, and with low overhead in
> > general. (For one thing, I'm not loading the contents into a
> > variable.)
>
> Doesn't matter. You allocate a string in which the contents is loaded (the
> return value of 'f.read()', and you hand over (a reference to) that string to
> the 'send()' method.
>
> Note that memory is allocated by data *values*, not by *variables* in Python
> (they are merely references to values).
>
> > Not that I'm saying that loading a huge file into memory will horribly
> > crash the system, but it's good to try to program in the safest way
> > possibly. For example, if you try something like this in the
>
> Depends on your system, and your biggest file.
>
> At a 32 bit platform, anything bigger than about 4GB (usually already at
> around 3GB) will crash the program for the simple reason that you are
> running out of address space to store bytes in.
>
> To fix, read and write blocks by specifying a block-size in the 'read()' call.

I see... Thanks for the reply. So what would be a good approach to solve
that problem? The best I can think of is something like:

    MAX_BUF_SIZE = 100000000  # about 100 MBs
    f = file("filename", "rb")
    f.seek(0, 2)  # relative to EOF
    length = f.tell()
    bPos = 0
    while bPos < length:
        f.seek(bPos)
        bPos += sock.send(f.read(MAX_BUF_SIZE))

-- 
http://mail.python.org/mailman/listinfo/python-list
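
For what it's worth, the block-wise read suggested in the quoted reply can probably be written without the seek()/tell() bookkeeping. Below is a minimal sketch under a few assumptions: it uses Python 3's open() in place of file(), the function name send_file_in_chunks and the 64 KiB default block size are hypothetical choices for illustration, and sock is assumed to be an already connected, blocking socket. It relies on sock.sendall(), which keeps sending until the whole block has gone out, so no position tracking is needed.

```python
CHUNK_SIZE = 64 * 1024  # hypothetical block size; tune for your workload


def send_file_in_chunks(sock, path, chunk_size=CHUNK_SIZE):
    """Send the file at `path` over `sock`, holding at most one chunk in memory.

    Returns the total number of bytes handed to the socket.
    """
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # reads at most chunk_size bytes
            if not chunk:               # an empty bytes object means EOF
                break
            sock.sendall(chunk)         # blocks until the whole chunk is sent
            total += len(chunk)
    return total
```

Since only one block is alive at a time, peak memory use stays near chunk_size regardless of the file size. A much smaller block than 100 MB is usually enough, because the socket buffers are far smaller than that anyway.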