S.Selvam Siva wrote:
Hi all,

I am running a Python script that parses nearly 22,000 locally stored HTML files using BeautifulSoup. The problem is that memory usage increases linearly as the files are parsed. Once the script has parsed 200 files or so, it consumes all the available RAM and CPU usage drops to 0% (maybe due to excessive paging).

I have to guess that you are somehow holding on to data associated with each file.

We tried 'del soup_object' and called 'gc.collect()', but saw no improvement.

'del ob' only deletes the association between the name 'ob' and the object it was associated with. The object itself cannot disappear until all other associations (references held by lists, dicts, attributes, other names) are gone as well.
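
To illustrate, here is a minimal sketch. The Page class and the 'cache' list are hypothetical stand-ins for whatever your script keeps per file, not anything from BeautifulSoup, and the immediate freeing shown at the end is CPython reference-counting behavior.

```python
import weakref


class Page:
    """Hypothetical stand-in for a parsed document object."""

    def __init__(self, name):
        self.name = name


cache = []                     # hypothetical container that keeps every page

page = Page("a.html")
alive = weakref.ref(page)      # weak reference: lets us observe the object
cache.append(page)             # a second reference to the same object

del page                       # removes only the name 'page'
still_alive_after_del = alive() is not None   # the list still refers to it

cache.clear()                  # drop the last remaining reference
gone_after_clear = alive() is None            # now the object can be freed

print(still_alive_after_del, gone_after_clear)
```

If something in your loop plays the role of 'cache' (a list of results, a dict keyed by filename, an attribute on a long-lived object), 'del soup_object' alone will not release the memory.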

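gc.collect() only reclaims objects that refer to each other in a cycle and that, taken together, are unreachable from anywhere else. It does nothing for objects your program can still reach.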
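
A small sketch of that distinction, again with a hypothetical Node class. Automatic collection is disabled during the demo only so that the timing is predictable.

```python
import gc
import weakref


class Node:
    """Hypothetical object that can point at another Node."""

    def __init__(self):
        self.other = None


gc.disable()                   # keep the automatic collector out of the demo

# Case 1: an isolated cycle. Nothing outside the pair refers to it.
a, b = Node(), Node()
a.other, b.other = b, a
probe = weakref.ref(a)
del a, b
cycle_survives_del = probe() is not None   # refcounts never reach zero
gc.collect()
cycle_collected = probe() is None          # the collector reclaims the pair

# Case 2: the same cycle, but still reachable through the name 'keep'.
c, d = Node(), Node()
c.other, d.other = d, c
keep = c
probe2 = weakref.ref(d)
del c, d
gc.collect()
reachable_cycle_kept = probe2() is not None  # still reachable, so not collected

gc.enable()
print(cycle_survives_del, cycle_collected, reachable_cycle_kept)
```

The second case is the one that matches your symptoms: as long as anything live leads to the parsed data, neither 'del' nor gc.collect() can free it.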

--
http://mail.python.org/mailman/listinfo/python-list
