Hello,

I have a surprisingly simple bit of code that updates records in a
database via Django's ORM. The "Hit" table has 1.5 million records. The
problem is that as the loop runs, more and more memory is consumed until
my machine starts thrashing on swap. The first 400,000 records finish in
5 minutes; the next 10,000 then take over 30 minutes! As far as I can
tell, when the 'hit' variable drops out of scope, its ref count should
go to 0 and it should be garbage collected. It appears it is not:
memory usage grows steadily over the loop, bringing the machine to its
knees. Here is the code snippet:

    print "filling hits identities percent..."
    
    total = Hit.objects.all().count()
    
    for n, hit in enumerate(Hit.objects.all()):
        print "%d/%d..."%(n,total),
        sys.stdout.flush()
        
        hit.identities_percent =
int(hit.deprecated_identities_percent())
        hit.save()
        print "done"
        
    print "all done"

What am I missing? Surely each "hit" object should be garbage collected
at the end of the innermost block? Is Django holding onto these objects
internally? Why is it consuming so much RAM? I could break the query
into LIMIT/OFFSET chunks with slice notation (sketched below), but I'd
rather understand why this is misbehaving so badly.

Kind Regards

Crispin