On Jun 30, 3:12 pm, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:
> On Mon, 30 Jun 2008 10:55:00 -0700, Tom Davis wrote:
> > To me, this seems illogical. I can understand that the GC is
> > reluctant to reclaim objects that have many connections to other
> > objects and so forth, but once those objects' scopes are gone, why
> > doesn't it force a reclaim? For instance, I can use timeit to create
> > an object instance, run a method of it, then `del` the variable used
> > to store the instance, but each loop thereafter continues to require
> > more memory and take more time. 1000 runs may take .27 usec/pass
> > whereas 100000 takes 2 usec/pass (average).
>
> `del` just removes the name and one reference to that object. Objects are
> only deleted when there's no reference to them anymore. Your example
> sounds like you keep references to objects somehow that are accumulating.
> Maybe by accident. Any class level bound mutables or mutable default
> values in functions in that source code? Would be my first guess.
>
> Ciao,
> Marc 'BlackJack' Rintsch
Marc,

Thanks for the tips. A quick confirmation: I took "class level bound
mutables" to mean something like:

    class A(object):
        SOME_MUTABLE = [1, 2]
        ...

And "mutable default values" to mean:

    ...
    def a(self, arg=[1, 2]):
        ...

If this is correct, I have none of these.

I understand your point about the references, but in my `timeit` example
the statement is as simple as this:

    from MyClass import MyClass
    a = MyClass()
    del a

So, yes, it would seem that object references are piling up and not
being removed, entirely by accident. Is there a list somewhere of class
attributes (mutable defaults, class-level mutables, etc.) that can keep
instances from being properly dereferenced?

My obvious hack around this is to do only X loops at a time and set up a
cron job to run the script over and over until all the files have been
processed, but I'd much rather make the code run as intended. I ran a
test overnight and found that at first it handled a few documents per
second, but by morning it had slowed down so much that a single document
took over an hour to process! RAM usage went from 20 MB at the start to
over 300 MB, when it should never need more than about 20 MB, since
everything is handled with local variables and new objects are
instantiated for each document. This is a serious problem.

Thanks,
Tom
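P.S. In case it helps, here's a minimal sketch of the diagnostic I plan
to run next, using the `gc` module to force a collection and see what
accumulates between passes. (The names `mymodule`, `MyClass`, and
`process` are placeholders for my real code.)

    import gc
    from collections import defaultdict

    from mymodule import MyClass  # placeholder for the real module/class

    def run_once():
        a = MyClass()
        a.process()  # placeholder for the real method
        del a        # removes this name's reference only; the object is
                     # freed only when no other references remain

    before = len(gc.get_objects())  # container objects tracked by the GC
    for _ in range(1000):
        run_once()
    gc.collect()  # force a full collection, including reference cycles

    print "tracked objects grew by:", len(gc.get_objects()) - before
    print "uncollectable (gc.garbage):", len(gc.garbage)

    # Rough histogram by type, to see which objects are piling up.
    counts = defaultdict(int)
    for obj in gc.get_objects():
        counts[type(obj).__name__] += 1
    for name, n in sorted(counts.items(), key=lambda item: -item[1])[:10]:
        print name, n

If the per-type counts keep climbing from run to run, that should point
at whatever is holding the references. (As I understand it, in CPython 2
instances that define __del__ and end up in a reference cycle are never
collected at all; they just sit in gc.garbage, which would also explain
the steady growth.)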