This may be relevant to the question: http://bugs.python.org/issue6653, 'Potential memory leak in multiprocessing', opened yesterday.
On Aug 5, 9:07 pm, Ahmed Fasih <wuzzyv...@gmail.com> wrote:
> I started writing this post asking for assistance on using Heapy to
> study my Sage application's memory usage, but I think a more
> important question relates to my use of @parallel('multiprocessing').
> So here goes.
>
> Section 1.
> On an 8-core machine, I kick off a function, decorated with
> @parallel('multiprocessing'), with 16 arguments. (BTW, Sage outrocks
> all; the sirens of multicore have given Matlab-addicted colleagues
> weeks of impotent rage, while it took me only a couple of hours of
> searching to find "parallel?". The creators of Sage will have a
> special place in heaven.)
>
> Anyway, when the 16 arguments are kicked to this parallel function
> (which creates large vectors and reduces them to a scalar), I see 8
> or so sage processes in "top", but one of them stands out: its
> memory footprint steadily increases in small 1-3% increments to
> about 12% of memory, and it stays at 12% after the sage script
> finishes. If I kick off 100 parallel runs, I get MemoryErrors due to
> excessive RAM use, and Sage doesn't recover: it hangs, Control-C
> raises KeyboardInterrupts but never shuts it down cleanly, and I
> have to kill the screen.
>
> But if I decorate my function with @parallel('reference'), that is,
> force serial execution, the Sage process at the end of many parallel
> runs takes only 1.5% of memory. (It just takes a lot more time.)
>
> Is there an issue with @parallel('multiprocessing') and garbage
> collection? Why would Sage sop up so much memory after parallel runs
> of a function that returns only a scalar? (The function takes a
> small compound object as input, deepcopies it, modifies the copy,
> generates large vectors from the original and modified objects, and
> reduces them to a scalar, which it returns. I haven't tried making
> Python explicitly garbage-collect the intermediate variables because
> that sounds risky.)
>
> Section 2.
> In a bid to figure out which Python objects are taking so much RAM,
> I installed Guppy and am trying to use Heapy
> (http://guppy-pe.sourceforge.net/heapy_tutorial.html). I cannot
> understand its output: "from guppy import hpy; hp=hpy(); hp.heap()"
> tells me that my 12% memory usage (of 8 GB RAM) amounts to just
> 40 MB. If anyone knows how to interpret its output, I'd be much
> obliged:
>
> sage: hp.heap()
> Partition of a set of 390118 objects. Total size = 42345660 bytes.
>  Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
>      0 168567  43 24633284  58   24633284 58 str
>      1  93441  24  3740796   9   28374080 67 tuple
>      2   1593   0  2051016   5   30425096 72 dict of module
>      3  14869   4  2022184   5   32447280 77 dict of numpy.core.defmatrix.matrix
>      4  25218   6  1714824   4   34162104 81 types.CodeType
>      5  24449   6  1369144   3   35531248 84 function
>      6   2518   1  1281712   3   36812960 87 dict of type
>      7   2301   1  1167912   3   37980872 90 dict (no owner)
>      8   2520   1  1094184   3   39075056 92 type
>      9  14869   4   832664   2   39907720 94 numpy.core.defmatrix.matrix
> <839 more rows. Type e.g. '_.more' to view.>
>
> Thanks very much!
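For what it's worth, here is roughly the pattern as I read it, as a
minimal sketch; the object, its methods (tweak, big_vector), and the
input list are hypothetical stand-ins for the actual code:

    from copy import deepcopy

    @parallel('multiprocessing')
    def reduce_to_scalar(obj):
        # deep-copy the small compound input and modify the copy
        modified = deepcopy(obj)
        modified.tweak()              # hypothetical modification
        # build large intermediate vectors from both objects
        v1 = obj.big_vector()         # hypothetical method
        v2 = modified.big_vector()
        # reduce to a single scalar, which is all that gets returned
        return (v1 - v2).norm()

    # Calling a @parallel-decorated function on a list of inputs
    # returns a generator of ((args, kwargs), result) pairs:
    for ((args, kwargs), result) in reduce_to_scalar(inputs):
        print args, result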
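On the garbage-collection point: explicitly deleting the large
intermediates and calling gc.collect() inside the worker should be
safe; it only drops references you are done with and forces a
collection that would otherwise happen later. A sketch, reusing the
same hypothetical function:

    import gc
    from copy import deepcopy

    @parallel('multiprocessing')
    def reduce_to_scalar(obj):
        modified = deepcopy(obj)
        modified.tweak()
        v1 = obj.big_vector()
        v2 = modified.big_vector()
        scalar = (v1 - v2).norm()
        # drop the only references to the large intermediates, then
        # ask the collector to reclaim any reference cycles right away
        del v1, v2, modified
        gc.collect()
        return scalar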
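On the Heapy numbers: hp.heap() only counts Python objects in the
process where you call it, so anything held by the multiprocessing
child processes will not show up there, and by default Heapy does not
know the size of data buffers owned by extension types. Note row 9 of
the output above: 832664 bytes over 14869 matrices is about 56 bytes
per numpy matrix, i.e. just the object header, not the matrix data.
That would explain 12% of 8 GB appearing as only ~40 MB. To narrow
down what is growing, you can set a relative mark before the run and
then break a row down by its referrers; run_parallel_job here is a
hypothetical stand-in for the actual computation:

    from guppy import hpy
    hp = hpy()
    hp.setrelheap()     # count only objects allocated after this point

    run_parallel_job()  # hypothetical: the @parallel computation

    h = hp.heap()       # heap relative to the mark above
    print h             # summary table like the one quoted above
    print h[0].byrcs    # break the biggest row down by referrers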