On Nov 16, 9:40 am, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> I keep blowing up memory and crashing the server. Watching my memory
> usage, it doesn't act like a leak. Instead of gradually rising, every
> process will hover along at about 20 megs usage, then one will spike
> to about 150, then settle down to about 100 and stay there. I've been
> scouring my error logs and such trying to figure out where it's
> happening, turned off all file uploading, upgraded mod_python and
> generally hit all the obvious targets, but no luck so far. Anyway, my
> latest theory is that it may be nesh thumbnails, which I'm using quite
> a bit. I read on the thumbnails site that "All image sizes are cached
> within private locmem:// instance to reduce filesystem access." and
> I'm wondering if that could be it... could it be blowing out the
> memory when it hits a page with several large images all being
> thumbnailed?
If you are using the Apache worker MPM, receiving bursts of concurrent requests against the same process can be an issue, especially for URLs which perform some operation that uses a lot of transient memory and then releases it.

To explain, consider the prefork MPM instead. There, requests to a specific Apache child process are serialised. So, if certain URLs temporarily use a lot of memory, for example because reportlab or PIL is being used to do something, that memory is released before the next request arrives, the next request can reuse the freed-up memory, and overall RSS doesn't need to grow much beyond what the first such request required.

With the worker MPM though, a specific Apache child process can receive concurrent requests. If that same resource is hit multiple times at the same time, each handler has to allocate its own memory at the same time, so overall RSS can spike. When the requests finish, the memory is released, but it just goes back to the process free pool and RSS stays where it was. Over time, RSS will therefore build up to whatever is required to handle your largest burst in traffic, and a lot of the time that memory may sit there unused.

There are a few things one should do.

First, unless there is a good reason for your whole site to be indexable by search engines, use a robots.txt file to block access to all but your primary site pages. Crawler requests don't tend to come in bursts, but blocking them will certainly reduce the chance of a lot of caching being triggered within your application through every page on your site being hit. This issue is probably worse on things like Trac and its source code browser than on Django applications.

Second, if you have request handlers that consume a lot of memory, consider using a thread mutex or a semaphore to limit how many concurrent requests can be hitting that handler. That way you can effectively serialise those requests and cap any transient burst in required memory (a rough sketch of this is shown further below).

Third, if using Python embedded in the Apache child processes, i.e., mod_python or embedded mode of mod_wsgi, set MaxRequestsPerChild to a lowish value, more so if using worker MPM perhaps, so that processes are recycled on a regular basis and memory usage is brought back to base levels again. If using daemon mode of mod_wsgi, one can use the 'maximum-requests' option to the WSGIDaemonProcess directive instead.

Fourth, if one were using daemon mode of mod_wsgi (2.0) instead of mod_python, one could also use the 'inactivity-timeout' option to WSGIDaemonProcess so that if your site is idle for a while the process is recycled and memory usage again returns to base level.

Using daemon mode of mod_wsgi can also provide other interesting ways of dealing with these sorts of problems. What one can easily do, without changing your Django application, is segment the application so that it is split across multiple distinct mod_wsgi daemon process groups. If you have specific sets of URLs which are a bit memory hungry, you can delegate them to run in specific daemon process groups. You can then set the 'maximum-requests' option for those daemon process groups to a much more aggressive lower value so they are recycled more often and memory is reclaimed quicker, without unnecessarily restarting your whole Django application. Similarly, one could set a quite low 'inactivity-timeout' for those daemon processes.
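To make the second suggestion a bit more concrete, here is a minimal sketch of serialising a memory hungry Django view with a semaphore. The view name, the _render_gallery_page helper and the limit of two concurrent requests are made up for this example rather than taken from the original post.

    import threading

    # Allow at most two requests at a time into the memory hungry
    # section; the limit of 2 is just a guess for the example.
    _thumbnail_semaphore = threading.BoundedSemaphore(2)

    def gallery(request):
        # Requests beyond the limit block here until a slot frees up,
        # so this process never needs the transient thumbnailing
        # memory for more than two requests at once.
        _thumbnail_semaphore.acquire()
        try:
            # _render_gallery_page is a stand-in for whatever view
            # code actually loads the images and builds the response.
            return _render_gallery_page(request)
        finally:
            _thumbnail_semaphore.release()

Note that a plain threading semaphore only limits concurrency within a single process; each Apache child or daemon process gets its own copy, which is usually still enough to take the top off a burst.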
As an example of splitting an application across daemon process groups:

    WSGIDaemonProcess main processes=2 maximum-requests=1000
    WSGIDaemonProcess bloat processes=1 maximum-requests=10 inactivity-timeout=30

    WSGIScriptAlias / /some/path/django.wsgi
    WSGIProcessGroup main

    <Location /some/urls/which/use/lots/of/memory>
    WSGIProcessGroup bloat
    </Location>

The main Django application runs across 2 multithreaded daemon processes. The memory bloating URLs are delegated to run in a single multithreaded process of their own, which is recycled more often. One could even use single threaded processes to force serialisation of the requests for those URLs and avoid issues with bursts in memory use:

    WSGIDaemonProcess bloat processes=5 threads=1 \
        maximum-requests=10 inactivity-timeout=30

So, using mod_wsgi and its more flexible means of distributing an application across multiple processes, including combinations of threaded and non-threaded processes, could also be a solution if your application is particularly afflicted with such issues.

One other source of problems these days is bots which attack sites using specific URL patterns in an attempt to find a wiki-like system into which they can inject SPAM comments. These attacks often come in bursts and so can trigger these sorts of memory issues as well.

Hope this is of help and/or interest.

Graham