On 10/24/2010 8:39 PM, kj wrote:
> I'm designing a system that will be very memory hungry unless it
> is "garbage-collected" very aggressively.
>
> In the past I have had disappointing results with the gc module:
> I noticed practically no difference in memory usage with and without
> it. It is possible, however, that I was not measuring memory
> consumption adequately.
>
> What's the most accurate way to monitor memory consumption in a
> Python program, and thereby ensure that gc is working properly?
>
> Also, are there programming techniques that will result in better
> garbage collection? For example, would it help to explicitly call
> del on objects that should be gc'd?

Python the language is not much concerned with memory. For an
interpreter running on a computer, there are four memory sizes to be
considered: the virtual memory assigned to the process; the physical
memory assigned to the process; the physical memory used by Python
objects; and the memory used by 'active' objects accessible from
program code. As far as I know, the OS can only see and report on the
first and/or second.
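
To make the distinction concrete, here is a rough sketch that prints one
figure from each side. It assumes a Python recent enough to have the
tracemalloc module (3.4 or later), which counts memory allocated for
Python objects; the resource module is Unix-only, so that part is
skipped elsewhere. The variable names are mine, purely for illustration.

```python
import tracemalloc

tracemalloc.start()

# Roughly 1 MB of live Python objects (1000 bytes objects of 1000 bytes each).
data = [bytes(1000) for _ in range(1000)]

# What the interpreter itself has allocated for objects since start().
current, peak = tracemalloc.get_traced_memory()
print("traced Python allocations: %d bytes (peak %d)" % (current, peak))

# What the OS reports for the whole process. ru_maxrss is the *peak*
# resident set size: kilobytes on Linux, bytes on macOS.
try:
    import resource  # not available on Windows
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print("peak resident set size from the OS:", rss)
except ImportError:
    rss = None
    print("resource module unavailable; no OS figure on this platform")
```

The two numbers will not agree, and they are not supposed to: the OS
figure covers the interpreter, loaded libraries and any allocator slack,
while the traced figure covers only object allocations.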
If the gap between the second and third (assigned and used physical
memory) is large and includes 'blocks' that are totally unused, the
interpreter *may* be able to return such blocks. But do not count on it.
When this gap expands because the program deletes objects without
returning blocks, people get fooled by OS reports of assigned memory not
shrinking (even though used memory is).
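
A small sketch of that effect, again assuming tracemalloc is available:
the memory used by objects drops as soon as the last reference goes
away, whatever top or Task Manager may still be showing for the
process. The sizes here are arbitrary.

```python
import tracemalloc

tracemalloc.start()

before, _ = tracemalloc.get_traced_memory()

# About 5 MB of bytearrays, all reachable through one list.
blocks = [bytearray(10_000) for _ in range(500)]
during, _ = tracemalloc.get_traced_memory()

# Dropping the only reference lets CPython's reference counting
# free every bytearray immediately; no gc.collect() is needed here.
del blocks
after, _ = tracemalloc.get_traced_memory()

print("before: %d  during: %d  after: %d" % (before, during, after))
```

Whether the process size seen by the OS also shrinks depends on the
allocator and the platform, which is why that figure alone is a poor
test of whether objects are being freed.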
CPython tries to minimize the gap between all objects and active objects
by both reference counting and cyclic garbage collection (gc). Yes,
you can help this along by judicious use of del and gc.collect. The goal
should be to minimize the maximum active memory size. Reusing large
arrays (rather than deleting and creating a new one) can sometimes help
by avoiding fragmentation of allocated memory.
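
As an illustration of both points, here is a minimal sketch. The first
half builds a reference cycle, which reference counting alone cannot
free, and shows gc.collect() reclaiming it. The second half reuses one
buffer while reading a stream instead of creating a new object per
chunk. The Node class, the sizes and the in-memory stream are invented
for the example.

```python
import gc
import io
import weakref


class Node:
    """Hypothetical object that can point at another Node."""
    def __init__(self):
        self.other = None


# Part 1: del plus gc.collect() on a reference cycle.
gc.disable()                 # keep automatic collection out of the demo
a, b = Node(), Node()
a.other, b.other = b, a      # a and b now refer to each other
probe = weakref.ref(a)       # lets us check whether a is still alive

del a, b                     # names are gone, but the cycle keeps both alive
alive_after_del = probe() is not None

found = gc.collect()         # the cyclic collector finds the unreachable pair
alive_after_collect = probe() is not None
gc.enable()

print("alive after del:", alive_after_del)
print("alive after gc.collect():", alive_after_collect)

# Part 2: reuse one large buffer rather than allocating one per chunk.
stream = io.BytesIO(b"x" * 300_000)   # stand-in for a file or socket
buf = bytearray(100_000)              # allocated once, filled repeatedly
total = 0
while True:
    n = stream.readinto(buf)          # writes into the existing buffer
    if n == 0:
        break
    total += n                        # process buf[:n] here

print("bytes processed:", total)
```

Without the cycle, del alone would have been enough; gc.collect() only
matters for objects that refer to each other.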
Terry Jan Reedy