On 15 Dec, 02:44, Tracy Reed <tr...@ultraviolet.org> wrote:
> I have code which looks basically like this:
>
> now = datetime.today()
> beginning = datetime.fromtimestamp(0)
> end = now - timedelta(days=settings.DAYSTOKEEP)
>
> def purgedb():
>     """Delete archivedEmail objects from the beginning of time until
>     daystokeep days in the past."""
>     queryset = archivedEmail.objects.all()
>     purgeset = queryset.filter(received__range=(beginning, end))
>     for email in purgeset:
>         print email
>         try:
>             os.unlink(settings.REAVER_CACHE+"texts/%s" % email.cacheID)
>             os.unlink(settings.REAVER_CACHE+"prob_good/%s" % email.cacheID)
>             os.unlink(settings.REAVER_CACHE+"prob_spam/%s" % email.cacheID)
>         except OSError:
>             pass
>     purgeset.delete()
>
> if __name__ == '__main__':
>     purgedb()
>
(snip)
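A side note on the try block quoted above: the three os.unlink calls
share one try, so if the first file is missing, the OSError skips the
other two and they stay on disk. My rewrite further down keeps your
structure; if that matters to you, here is a minimal sketch of a
per-file variant. remove_cache_files, cache_root and subdirs are names
I made up for the illustration, and ignoring only ENOENT (file not
found) is my assumption about what you want to tolerate.

```python
import errno
import os


def remove_cache_files(cache_root, cache_id,
                       subdirs=("texts", "prob_good", "prob_spam")):
    """Unlink the cache file for cache_id in each subdir of cache_root.

    Hypothetical helper, not part of the original script. Each unlink
    gets its own try, so one missing file does not stop the others from
    being removed. Returns the number of files actually removed.
    """
    removed = 0
    for subdir in subdirs:
        path = os.path.join(cache_root, subdir, str(cache_id))
        try:
            os.unlink(path)
        except OSError as exc:
            # Assumption: only "file not found" is safe to ignore.
            # Anything else (permissions, etc.) is re-raised.
            if exc.errno != errno.ENOENT:
                raise
        else:
            removed += 1
    return removed
```

In the loop you would then call
remove_cache_files(settings.REAVER_CACHE, email.cacheID) instead of the
three unlink lines.
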
> But when purgedb runs it deletes emails 100 at a time (which takes
> forever) and after running for a couple of hours uses a gig and a half
> of RAM. If I let it continue after a number of hours it runs the
> machine out of RAM/swap.

That looks like settings.DEBUG=True to me.

> Am I doing something which is not idiomatic or misusing the ORM
> somehow? My understanding is that it should be lazy so using
> objects.all() on queryset and then narrowing it down with a
> queryset.filter() to make a purgeset should be ok, right?

No problem here, as long as you don't do anything that forces
evaluation of the queryset. It is still redundant, though: you can just
as well build the appropriate queryset directly with objects.filter().

> What can I
> do to make this run in reasonable time/memory?

Others have already suggested checking whether settings.DEBUG is set to
True. That is the usual suspect when it comes to RAM issues with
Django's ORM: with DEBUG on, Django records every SQL query it runs in
connection.queries, so a long-running script keeps growing.

As for the other problem mentioned, building a whole model instance for
each row: you can save a lot of work here by using a values_list()
queryset, since tuples are much cheaper than model instances.

Oh, and yes: I/O and filesystem operations are not free either. Fixing
that won't solve your problem with the script eating all the RAM, but
it surely affects overall performance.

Now for something different, here are a couple of other Python
optimisation tricks:

> for email in purgeset:
>     print email

Remove this. I/O is not free. Really.

> try:
>     os.unlink(settings.REAVER_CACHE+"texts/%s" % email.cacheID)
>     os.unlink(settings.REAVER_CACHE+"prob_good/%s" % email.cacheID)
>     os.unlink(settings.REAVER_CACHE+"prob_spam/%s" % email.cacheID)
> except OSError:
>     pass

Move all the redundant attribute lookups (os.unlink and
settings.REAVER_CACHE) and string concatenations out of this loop.

# (imports unchanged from your script)

def purgedb():
    """Delete archivedEmail objects from the beginning of time until
    daystokeep days in the past.
    """
    # Build the path templates and look up os.unlink once, not per row.
    text_cache = settings.REAVER_CACHE + "texts/%s"
    prob_good_cache = settings.REAVER_CACHE + "prob_good/%s"
    prob_spam_cache = settings.REAVER_CACHE + "prob_spam/%s"
    unlink = os.unlink

    # no reason to put this outside the function.
    now = datetime.today()
    beginning = datetime.fromtimestamp(0)
    end = now - timedelta(days=settings.DAYSTOKEEP)
    qs = archivedEmail.objects.filter(received__range=(beginning, end))

    # values_list() yields plain tuples instead of model instances.
    for row in qs.values_list("cacheID"):
        cacheID = row[0]
        try:
            unlink(text_cache % cacheID)
            unlink(prob_good_cache % cacheID)
            unlink(prob_spam_cache % cacheID)
        except OSError:
            pass

    qs.delete()

Oh, and one last point: how exactly do you run this script?

HTH