Python memory handling
Greets, I'm having some trouble getting my memory freed by Python. How can I force it to release the memory? I've tried del and gc.collect() with no success.

Here is a code sample, parsing an XML file under Linux with Python 2.4 (same problem with Python 2.5 under Windows, tried with the first example):

#Python interpreter memory usage : 1.1 Mb private, 1.4 Mb shared
#Using http://www.pixelbeat.org/scripts/ps_mem.py to get memory information
import cElementTree as ElementTree #meminfo: 2.3 Mb private, 1.6 Mb shared
import gc #no memory change

et=ElementTree.parse('primary.xml') #meminfo: 34.6 Mb private, 1.6 Mb shared
del et #no memory change
gc.collect() #no memory change

So how can I free the 32.3 Mb taken by ElementTree?

The same problem occurs with a simple file.readlines():

#Python interpreter memory usage : 1.1 Mb private, 1.4 Mb shared
import gc #no memory change
f=open('primary.xml') #no memory change
data=f.readlines() #meminfo: 12 Mb private, 1.4 Mb shared
del data #meminfo: 11.5 Mb private, 1.4 Mb shared
gc.collect() # no memory change

But it works great with file.read():

#Python interpreter memory usage : 1.1 Mb private, 1.4 Mb shared
import gc #no memory change
f=open('primary.xml') #no memory change
data=f.read() #meminfo: 7.3 Mb private, 1.4 Mb shared
del data #meminfo: 1.1 Mb private, 1.4 Mb shared
gc.collect() # no memory change

So as far as I can see, Python maintains a memory pool for lists. In my first example, if I reparse the XML file the memory doesn't grow very much (0.1 Mb, precisely), so I think I'm right about the memory pool. But is there a way to force Python to release this memory?

Regards,
FP

--
http://mail.python.org/mailman/listinfo/python-list
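A minimal sketch of taking similar measurements from inside the process, assuming Linux (it reads VmRSS from /proc/self/status, which is total resident size rather than ps_mem.py's private/shared split; the helper name rss_kb is made up for illustration):

import gc

def rss_kb():
    #Resident set size in kB, read from /proc/self/status (Linux only)
    for line in open('/proc/self/status'):
        if line.startswith('VmRSS:'):
            return int(line.split()[1])
    return -1 #field not found

base = rss_kb()
data = open('primary.xml').readlines()
print 'after readlines: +%d kB' % (rss_kb() - base)
del data
gc.collect()
print 'after del + gc.collect: +%d kB' % (rss_kb() - base)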
Re: Python memory handling
On 31 May, 14:16, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:
> In <[EMAIL PROTECTED]>, frederic.pica wrote:
>
> > So as I can see, python maintain a memory pool for lists.
> > In my first example, if I reparse the xml file, the memory doesn't
> > grow very much (0.1 Mb precisely)
> > So I think I'm right with the memory pool.
>
> > But is there a way to force python to release this memory ?!
>
> AFAIK not. But why is this important as long as the memory consumption
> doesn't grow constantly? The virtual memory management of the operating
> system usually takes care that only actually used memory is in physical
> RAM.
>
> Ciao,
> Marc 'BlackJack' Rintsch

Because I'm an adept of "small is beautiful". Of course the OS will swap the unused memory if needed, but if I daemonize this application I will have a constant 40 Mb in use that is not free for other applications. If another application needs this memory, the OS will have to swap, and the other application loses time. And I'm not sure the system will swap this unused memory first; it could just as well swap another application first, AFAIK. And these 40 Mb are only for a 7 Mb XML file; what about parsing a big one, like 50 Mb? I would have preferred to have the choice of freeing this unused memory manually, or of setting the size of the memory pool manually.

Regards,
FP

--
http://mail.python.org/mailman/listinfo/python-list
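For the big-file case there is at least one workaround that avoids holding the whole tree in memory in the first place: cElementTree's iterparse. A minimal sketch follows; the tag name 'package' and the handle() function are hypothetical stand-ins for whatever primary.xml actually contains.

import cElementTree as ElementTree

def handle(elem):
    #hypothetical per-record processing
    pass

#Stream the file: each element is cleared as soon as it is handled,
#so the full 7 Mb (or 50 Mb) tree is never live at once.
for event, elem in ElementTree.iterparse('primary.xml', events=('end',)):
    if elem.tag == 'package': #'package' is a hypothetical tag name
        handle(elem)
        elem.clear() #drop the element's children immediately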
Re: Python memory handling
On 31 May, 16:22, Paul Melis <[EMAIL PROTECTED]> wrote:
> Hello,
>
> [EMAIL PROTECTED] wrote:
> > I've some troubles getting my memory freed by python, how can I force
> > it to release the memory ?
> > I've tried del and gc.collect() with no success.
> [...]
> > The same problem here with a simple file.readlines()
> > #Python interpreter memory usage : 1.1 Mb private, 1.4 Mb shared
> > import gc #no memory change
> > f=open('primary.xml') #no memory change
> > data=f.readlines() #meminfo: 12 Mb private, 1.4 Mb shared
> > del data #meminfo: 11.5 Mb private, 1.4 Mb shared
> > gc.collect() # no memory change
>
> > But works great with file.read() :
> > #Python interpreter memory usage : 1.1 Mb private, 1.4 Mb shared
> > import gc #no memory change
> > f=open('primary.xml') #no memory change
> > data=f.read() #meminfo: 7.3Mb private, 1.4 Mb shared
> > del data #meminfo: 1.1 Mb private, 1.4 Mb shared
> > gc.collect() # no memory change
>
> > So as I can see, python maintain a memory pool for lists.
> > In my first example, if I reparse the xml file, the memory doesn't
> > grow very much (0.1 Mb precisely)
> > So I think I'm right with the memory pool.
>
> > But is there a way to force python to release this memory ?!
>
> This is from the 2.5 series release notes
> (http://www.python.org/download/releases/2.5.1/NEWS.txt):
>
> "[...]
> - Patch #1123430: Python's small-object allocator now returns an arena to
>   the system ``free()`` when all memory within an arena becomes unused
>   again. Prior to Python 2.5, arenas (256KB chunks of memory) were never
>   freed. Some applications will see a drop in virtual memory size now,
>   especially long-running applications that, from time to time, temporarily
>   use a large number of small objects. Note that when Python returns an
>   arena to the platform C's ``free()``, there's no guarantee that the
>   platform C library will in turn return that memory to the operating
>   system. The effect of the patch is to stop making that impossible, and
>   in tests it appears to be effective at least on Microsoft C and
>   gcc-based systems. Thanks to Evan Jones for hard work and patience.
> [...]"
>
> So with 2.4 under Linux (as you tested) you will indeed not always get
> the used memory back, with respect to lots of small objects being
> collected.
>
> The difference therefore (I think) you see between doing an f.read() and
> an f.readlines() is that the former reads in the whole file as one large
> string object (i.e. not a small object), while the latter returns a list
> of lines where each line is a Python object.
>
> I wonder how 2.5 would work out on Linux in this situation for you.
>
> Paul

Hello, I will try later with Python 2.5 under Linux, but as far as I can see it's the same problem under my Windows Python 2.5. After reading this document: http://evanjones.ca/memoryallocator/python-memory.pdf I think it's because lists or dictionaries are used by the parser, and Python uses an internal memory pool (not pymalloc) for them...

Regards,
FP

--
http://mail.python.org/mailman/listinfo/python-list
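A sketch of the usual way to sidestep the many-small-objects pattern Paul describes: iterate over the file object directly instead of materializing every line at once, so only one line object is alive at a time (handle_line() is a hypothetical stand-in for the real per-line work):

def handle_line(line):
    #hypothetical per-line processing
    pass

f = open('primary.xml')
for line in f: #file objects are iterable; one line object at a time
    handle_line(line)
f.close()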
Re: Python memory handling
On 31 May, 17:29, "Josh Bloom" <[EMAIL PROTECTED]> wrote:
> If the memory usage is that important to you, you could break this out
> into 2 programs, one that starts the jobs when needed, the other that
> does the processing and then quits.
> As long as the python startup time isn't an issue for you.
>
> On 31 May 2007 04:40:04 -0700, [EMAIL PROTECTED] wrote:
> > Greets,
> >
> > I've some troubles getting my memory freed by python, how can I force
> > it to release the memory ?
> > I've tried del and gc.collect() with no success.
> [...]
> > But is there a way to force python to release this memory ?!
> >
> > Regards,
> > FP

Yes, that's a solution, but I don't think it's a good way; I didn't want to use bad hacks to bypass a Python-specific problem. And the problem is everywhere: every Python program has to manage big files somewhere. I've tried xml.dom.minidom on a 66 Mb XML file => 675 Mb of memory that will never be freed, and that time I got many unreachable objects when running gc.collect(). Using the same file with cElementTree took 217 Mb, with no unreachable objects.

For me it's not good behavior to let the system swap this unused memory instead of freeing it. I think a memory pool is a really good idea for performance reasons, but why is there no "free block" limit? Python is a really good language that can do many things in a clear, easy and efficient way; it has always met my needs. But I can't imagine there is no good solution to this problem: limiting the free-block pool size, or better, letting the user specify this limit, or even better, letting the user free the pool completely (along with setting the limit manually). Something like this hypothetical API:

import pool
pool.free()
pool.limit(size_in_megabytes)

Why not let the user choose that? Why not give the user more flexibility?
I will try later under Linux with the latest stable Python.

Regards,
FP

--
http://mail.python.org/mailman/listinfo/python-list
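A minimal sketch of the two-process approach Josh suggests, assuming the parsing step can hand its result back as text on stdout; the child script name extract.py and its output are made up for illustration:

import subprocess

#Parent process: stays small. The child takes the big memory hit,
#prints its result and exits, at which point the OS reclaims
#everything the child allocated.
proc = subprocess.Popen(['python', 'extract.py', 'primary.xml'],
                        stdout=subprocess.PIPE)
output, err = proc.communicate()
print 'result from child: %r' % output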