On Jun 20, 4:48 am, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> I am using Python to process particle data from a physics simulation.
> There are about 15 MB of data associated with each simulation, but
> there are many simulations. I read the data from each simulation into
> Numpy arrays and do a simple calculation on them that involves a few
> eigenvalues of small matrices and quite a number of temporary
> arrays. I had assumed that generating lots of temporary arrays
> would make my program run slowly, but I didn't think that it would
> cause the program to consume all of the computer's memory, because I'm
> only dealing with 10-20 MB at a time.
>
> So, I have a function that reliably increases the virtual memory usage
> by ~40 MB each time it's run. I'm measuring memory usage by looking
> at the VmSize and VmRSS lines in the /proc/[pid]/status file on an
> Ubuntu (edgy) system. This seems strange because I only have 15 MB of
> data.
>
> I started looking at the difference between what gc.get_objects()
> returns before and after my function. I expected to see zillions of
> temporary Numpy arrays that I was somehow unintentionally maintaining
> references to. However, I found that only 27 additional objects were
> in the list that comes from get_objects(), and all of them look
> small: a few strings, a few small tuples, a few small dicts, and a
> Frame object.
>
> I also found a tool called heapy (http://guppy-pe.sourceforge.net/)
> which seems to be able to give useful information about memory usage
> in Python. This seemed to confirm what I found from manual
> inspection: only a few new objects are allocated by my function, and
> they're small.
>
> I found Evan Jones's article about the Python 2.4 memory allocator
> never freeing memory in certain circumstances:
> http://evanjones.ca/python-memory.html
> This sounds a lot like what's happening to me. However, his patch was
> applied in Python 2.5, and I'm using Python 2.5. Nevertheless, it
> looks an awful lot like Python doesn't think it's holding on to the
> memory, but doesn't give it back to the operating system, either. Nor
> does Python reuse the memory, since each successive call to my
> function consumes an additional 40 MB. This continues until finally
> the VM usage is gigabytes and I get a MemoryError.
>
> I'm using Python 2.5 on an Ubuntu edgy box, and numpy 1.0.3. I'm also
> using a few routines from scipy 0.5.2, but for this part of the code
> it's just the eigenvalue routines.
>
> It seems that the standard advice when someone has a bit of Python
> code that progressively consumes all memory is to fork a process. I
> guess that's not the worst thing in the world, but it certainly is
> annoying. Given that others seem to have had this problem, is there a
> slick package to do this? I envision:
>
>     value = call_in_separate_process(my_func, my_args)
>
> Suggestions about how to proceed are welcome. Ideally I'd like to
> know why this is going on and fix it. Short of that, workarounds that
> are more clever than the "separate process" one are also welcome.
>
> Thanks,
> Greg
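[For anyone who wants to reproduce the poster's measurement, the VmSize and
VmRSS lines in /proc/[pid]/status can be read with a small helper. The sketch
below is Linux-specific and illustrative only; the function name and the kB
parsing are assumptions, not code from the post.

    import os

    def vm_usage(pid=None):
        """Return (VmSize, VmRSS) in kB, read from /proc/[pid]/status."""
        if pid is None:
            pid = os.getpid()
        sizes = {}
        with open('/proc/%d/status' % pid) as status:
            for line in status:
                # Lines look like "VmSize:   40900 kB"; keep just the number.
                if line.startswith(('VmSize:', 'VmRSS:')):
                    key, value = line.split(':', 1)
                    sizes[key] = int(value.split()[0])
        return sizes.get('VmSize'), sizes.get('VmRSS')

    # e.g. sample VmSize/VmRSS around the leaky function:
    # before = vm_usage(); my_func(my_args); after = vm_usage()
]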
I had almost the same problem. Will this do?

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/511474

Any comments are welcome (I wrote the recipe with Pythonistas' help).

Regards,
Muhammad Alkarouri
--
http://mail.python.org/mailman/listinfo/python-list
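[For reference, a rough sketch of the call_in_separate_process idea the
original poster envisions, written against the standard multiprocessing
module (available from Python 2.6 onward). It is not the code from the recipe
above, and every name in it is illustrative. The point of the pattern is that
the leaky function runs in a child process, so whatever memory it bloats is
returned to the operating system when the child exits.

    import multiprocessing

    def _worker(queue, func, args, kwargs):
        # Run the function in the child and ship back either the result
        # or the exception it raised; both must be picklable.
        try:
            queue.put((True, func(*args, **kwargs)))
        except Exception as exc:
            queue.put((False, exc))

    def call_in_separate_process(func, *args, **kwargs):
        """Run func in a child process so its memory is freed when it exits."""
        queue = multiprocessing.Queue()
        proc = multiprocessing.Process(target=_worker,
                                       args=(queue, func, args, kwargs))
        proc.start()
        # Fetch the result before join() so a large result cannot deadlock
        # the child on a full queue.
        ok, payload = queue.get()
        proc.join()
        if ok:
            return payload
        raise payload

    # Usage, matching the call shape from the original post:
    # value = call_in_separate_process(my_func, my_args)
]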