On Nov 16, 9:40 am, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
wrote:
> I keep blowing up memory and crashing the server. Watching my memory
> usage, it doesn't act like a leak. Instead of gradually rising, every
> process will hover along at about 20 megs usage, then one will spike
> to about 150, then settle down to about 100 and stay there. I've been
> scouring my error logs and such trying to figure out where it's
> happening, turned off all file uploading, upgraded mod_python and
> generally hit all the obvious targets, but no luck so far. Anyway, my
> latest theory is that it may be nesh thumbnails, which I'm using quite
> a bit. I read  on the thumbnails site that "All image sizes are cached
> within private locmem:// instance to reduce filesystem access." and
> I'm wondering if that could be it... could it be blowing out the
> memory when it hits a page with several large images all being
> thumbnailed?

If you are using Apache worker MPM, bursts of concurrent requests
hitting the same process can be a problem, especially for URLs which
perform some operation that uses a lot of transient memory and then
releases it.

To explain, consider prefork MPM instead. There, requests to a
specific Apache child process are serialised. Thus, if a lot of memory
is temporarily used for certain URLs, for example if reportlab or PIL
were being used to do something, the memory would be released prior to
the next request. The next request could then reuse that freed up
memory, and overall RSS wouldn't need to grow much beyond what the
initial request required.
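
To make that concrete, a thumbnailing view roughly along these lines
(a hypothetical sketch only; the path and view name are made up)
briefly holds the whole decoded image in memory for each request:

  from PIL import Image
  from django.http import HttpResponse

  def thumbnail(request):
      # Decoding a large source image can transiently consume tens of
      # megabytes for the duration of the request.
      im = Image.open('/path/to/some/large/image.jpg')
      im.thumbnail((200, 200))
      response = HttpResponse(content_type='image/jpeg')
      # HttpResponse is file-like, so the thumbnail can be written
      # straight into it.
      im.save(response, 'JPEG')
      # Once the view returns, 'im' is garbage collected and the memory
      # goes back to the process free pool, but RSS does not shrink.
      return response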

In worker MPM though, a specific Apache child process can receive
concurrent requests. So, if that same resource is hit multiple times
at once, each handler has to allocate its own memory simultaneously
and overall RSS can spike. When the requests are finished the memory
is released, but it just goes back to the process free pool and RSS
stays where it was.

So, over time RSS will build up to whatever is required to handle your
largest burst in traffic. A lot of the time, though, that memory may
sit there unused.

There are a few things one should do.

First, unless there is a good need for your whole site to be indexable
by search engines, use a robots.txt file to block access to all but
your primary site pages. Search engine requests don't tend to come in
bursts, but blocking them will certainly reduce the chance of a lot of
caching being triggered within your application through a crawler
hitting every page on your site. This issue is probably worse on
things like Trac and its source code browser than on Django
applications.
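
For example, a robots.txt at the root of the site along these lines
(the paths are placeholders for whatever is expensive on your own
site) keeps well behaved crawlers away from the costly URLs:

  # Placeholder paths; substitute the expensive areas of your own site.
  User-agent: *
  Disallow: /gallery/
  Disallow: /search/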

Second, if you have request handlers that do consume a lot of memory,
consider using a thread mutex or a semaphore to limit how many
concurrent requests can be inside that handler. That way you can
effectively ensure that those requests are serialised and avoid a
transient burst in required memory.
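
A minimal sketch of that idea, assuming a single memory hungry view
(all names here are invented for illustration):

  import threading

  # Allow at most two requests inside the expensive view at once; a
  # value of 1 turns this into a plain mutex and fully serialises them.
  _thumbnail_semaphore = threading.BoundedSemaphore(2)

  def expensive_view(request):
      _thumbnail_semaphore.acquire()
      try:
          # generate_response() stands in for whatever memory hungry
          # work the real view does.
          return generate_response(request)
      finally:
          _thumbnail_semaphore.release()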

Third, if using Python embedded in the Apache child processes, ie.,
mod_python or embedded mode of mod_wsgi, set MaxRequestsPerChild to a
lowish value, more so perhaps if using worker MPM, so that processes
are recycled on a regular basis and memory usage is brought back to
base levels again. If using daemon mode of mod_wsgi, one can use the
'maximum-requests' option to the WSGIDaemonProcess directive instead.
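
For instance (the numbers and the daemon process name are only
illustrative and would need tuning for your own site):

  # Embedded mode (mod_python, or embedded mode of mod_wsgi).
  MaxRequestsPerChild 1000

  # Daemon mode of mod_wsgi.
  WSGIDaemonProcess example.com processes=2 threads=15 \
      maximum-requests=1000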

Fourth, if one were using daemon mode of mod_wsgi (2.0) instead of
mod_python, one could also use the 'inactivity-timeout' option to
WSGIDaemonProcess so that if your site is idle for a while the process
is recycled, again returning memory usage back to base level.

Using daemon mode of mod_wsgi can also provide other interesting ways
of dealing with these sorts of problems. What one can easily do,
without changing your Django application, is segment it so that it is
split across multiple distinct mod_wsgi daemon process groups. Thus,
if you have specific sets of URLs which are a bit memory hungry, you
can delegate them to run in specific daemon process groups. You can
then set the 'maximum-requests' option for those daemon process groups
to a much more aggressive lower value so they are recycled more often
and memory reclaimed more quickly, without unnecessarily restarting
your whole Django application. Similarly, one could set a quite low
'inactivity-timeout' for those daemon processes.

As an example:

  WSGIDaemonProcess main processes=2 maximum-requests=1000
  WSGIDaemonProcess bloat processes=1 maximum-requests=10 \
    inactivity-timeout=30

  WSGIScriptAlias / /some/path/django.wsgi
  WSGIProcessGroup main

  <Location /some/urls/which/use/lots/of/memory>
  WSGIProcessGroup bloat
  </Location>

The main Django application runs across two multithreaded daemon
processes. The memory bloating URLs are delegated to run in a single
multithreaded process of their own which is recycled more often. One
could even use single threaded processes to force serialisation of the
requests for those URLs and avoid issues with bursts in memory use:

  WSGIDaemonProcess bloat processes=5 threads=1 \
    maximum-requests=10 inactivity-timeout=30

So, using mod_wsgi and its more flexible means of distributing an
application across multiple processes, including combinations of
threaded and non-threaded processes, could also be a solution if your
application is particularly afflicted with such issues.

One other source of problems these days is bots which attack sites
using specific URL patterns in an attempt to find a wiki-like system
into which they can inject spam comments. These attacks often occur in
bursts and so can trigger these sorts of memory issues.

Hope this is of help and/or interest.

Graham