Robin Becker wrote:
Richard Brodie wrote:

"Robin Becker" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]

Gerald Klix wrote:

Map the file into RAM by using the mmap module.
The file's contents are then available as a searchable string.
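A minimal sketch of that suggestion (the file contents and pattern here are placeholders, not from the thread). Note that regexes can be run directly against an mmap object, so nothing forces the whole file into a Python string:

```python
import mmap
import os
import re
import tempfile

# Write a small sample file to stand in for the real one.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"INFO ok\nERROR disk full\nINFO ok\nERROR timeout\n")
    path = tmp.name

# Map it read-only and search the mapping itself.
with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        offsets = [m.start() for m in re.finditer(rb"ERROR", mm)]

os.unlink(path)
print(offsets)  # byte offsets of each match
</imports>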
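A minimal sketch of that suggestion (the file contents and pattern here are placeholders, not from the thread). Note that regexes can be run directly against an mmap object, so nothing forces the whole file into a Python string:

```python
import mmap
import os
import re
import tempfile

# Write a small sample file to stand in for the real one.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"INFO ok\nERROR disk full\nINFO ok\nERROR timeout\n")
    path = tmp.name

# Map it read-only and search the mapping itself.
with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        offsets = [m.start() for m in re.finditer(rb"ERROR", mm)]

os.unlink(path)
print(offsets)  # byte offsets of each match
```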


that's a good idea, but I wonder if it actually saves on memory? I just tried
regexing through a 25MB file and ended up with a 40MB working set (it rose
linearly as the loop progressed through the file). Am I actually saving anything
by not letting normal VM do its thing?



You aren't saving memory in that sense, no. If you have any RAM spare, the
file will end up in it. However, if you are short on memory, mmapping the
file gives the VM the opportunity to discard pages from the file instead of paging
them out. Try again with a 25GB file and watch the difference ;) YMMV.




:)

So we avoid dirty page writes, etc. However, I still think I could get away with a small window into the file, which would be more efficient.

I seem to remember that the Medusa code contains a fairly good overlapped search for a terminator string, if you want to chunk the file.


Take a look at the handle_read() method of class async_chat in the standard library's asynchat.py.
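For reference, the overlapped-search idea looks roughly like this. This is a sketch in the spirit of the Medusa/asynchat approach, not the actual library code; the function name and signature are mine:

```python
import io


def find_terminator(fileobj, terminator, chunk_size=64 * 1024):
    """Return the absolute offset of the first occurrence of
    `terminator` in `fileobj`, or -1 if absent, reading one chunk
    at a time. Only len(terminator) - 1 bytes are carried over
    between chunks, so a match that straddles a chunk boundary is
    still found while the window stays small."""
    overlap = len(terminator) - 1
    pos = 0          # absolute offset of the start of `buf`
    tail = b""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            return -1
        buf = tail + chunk
        idx = buf.find(terminator)
        if idx != -1:
            return pos + idx
        tail = buf[-overlap:] if overlap else b""
        pos += len(buf) - len(tail)


# The terminator straddles the 8-byte chunk boundary here:
data = io.BytesIO(b"aaaaaaaENDbbbb")
print(find_terminator(data, b"END", chunk_size=8))  # -> 7
```

The same carry-over trick is what asynchat's handle_read() does with its terminator, only driven by socket reads instead of file reads.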

regards
 Steve
--
Steve Holden        +1 703 861 4237  +1 800 494 3119
Holden Web LLC             http://www.holdenweb.com/
Python Web Programming  http://pydish.holdenweb.com/

--
http://mail.python.org/mailman/listinfo/python-list
