On Jan 19, 2009, at 3:12 AM, S.Selvam Siva wrote:

Hi all,

I am running a Python script that uses BeautifulSoup to parse nearly
22,000 HTML files stored locally.
The problem is that memory usage increases linearly as the files are
parsed.
By the time the script has parsed 200 files or so, it has consumed all
the available RAM, and CPU usage drops to 0% (probably due to excessive
paging).

We tried 'del soup_object' followed by 'gc.collect()', but saw no
improvement.

Please advise me on how to limit Python's memory usage, or on the proper
way to handle BeautifulSoup objects in a resource-efficient manner.

You need to figure out where the memory is going. Try commenting out parts of your script. For instance, start with a minimalist script: open and close the files, but don't process them, and see whether the memory usage is still a problem. Then add elements back in, making your minimalist script more and more like the real one. If the excessive memory usage is isolated to one component or section, you'll find it this way. A rough sketch of the starting point is below.
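Something along these lines would be the step-one script (a sketch only; the glob pattern and the commented-out parsing step are placeholders for whatever your real script does):

    import glob

    # Step 1: just open and read every file -- no parsing at all.
    # If memory is already climbing here, BeautifulSoup is not the cause.
    for path in glob.glob('/path/to/files/*.html'):   # placeholder pattern
        f = open(path)
        html = f.read()
        f.close()
        # Step 2: uncomment the next line, rerun, and compare memory use.
        # soup = BeautifulSoup(html)
        # Step 3: add your per-file processing back, one piece at a time.

Watch the process in top (or Task Manager) between steps; the step at which memory starts growing without bound is your culprit.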
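If the parsing step turns out to be the leak, two BeautifulSoup techniques often help. The sketch below uses the newer bs4 API (in BeautifulSoup 3 the strainer keyword is parseOnlyThese and the finder is findAll), and it assumes, purely for illustration, that you only need the <a> tags from each file: a SoupStrainer keeps the tree small by parsing only the tags you care about, and decompose() explicitly destroys the tree and breaks its internal parent/child references so the memory can be reclaimed promptly.

    from bs4 import BeautifulSoup, SoupStrainer

    links_only = SoupStrainer('a')   # build the tree from <a> tags only

    def extract_hrefs(html):
        # parse_only keeps the parsed tree small from the start
        soup = BeautifulSoup(html, 'html.parser', parse_only=links_only)
        hrefs = [a.get('href') for a in soup.find_all('a')]
        # decompose() unlinks every node in the tree, breaking the
        # internal references so the memory can be freed promptly
        soup.decompose()
        return hrefs

extract_hrefs is just a hypothetical name; substitute whatever your script actually pulls out of each file.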

HTH
Philip
--
http://mail.python.org/mailman/listinfo/python-list
