On Mar 25, 3:39 am, David Christiansen <d...@dis.dk> wrote:
> Dear fellow Django users,
>
> We're beginning to be at wits end with a particular problem that we're
> having that I suspect has to do with server configuration issues, but
> might be something else.
>
> Our Django processes (whether it be Apache when running mod_python or
> independent daemons when running mod_wsgi in daemon mode) begin to
> consume CPU shortly after the server is started.  I expect to see
> short spikes in CPU usage when requests occur, with some very small
> amount (single-digit %) when there are no requests occurring.
> Instead, we get a slowly increasing constant load on the CPU that
> eventually begins to affect the performance of the site.  Restarting
> the server makes the problem go away, but it returns after a few
> requests have been made.
>
> I've been unable to replicate this problem when running Django locally
> on my computer, with wsgi or with manage.py runserver.
>
> Because the CPU consumption comes from the mod_wsgi daemon processes,
> I am quite sure that the problem lies somewhere in Python and not in
> an Apache module or in Postgres.
>
> As I can't replicate this problem locally, I've tried profiling the
> site on our actual server by running a second instance of Apache
> and mod_wsgi with the repoze.profile WSGI middleware.  This middleware
> appears to work just fine, but I've not yet been able to replicate the
> problem when running it.
>
> Additionally, repoze.profile starts recording profile data when a
> request begins, and stops when the request is done.  It strikes me as
> possible that it may not even record data on a long-running CPU hog if
> I can create it.
>
> I have two questions for you all:
>
>  1. Is there a good way to use cProfile or hotshot on the _entire_
> WSGI application rather than as a per-request middleware while running
> under mod_wsgi?  cProfile's restriction that it must be called with a
> string that is eval'd seems to make this impossible, but I'm quite new
> to WSGI configuration so I might be missing something obvious.
> Hotshot would seem to have the same problem, even though it wants a
> callable rather than a string.

It is not quite that simple. Under Apache/mod_wsgi the threads which
handle requests are not created by Python code. Instead, they are a
pool of threads created at the Apache level. So, there isn't anywhere
in Python code where you can hook in a profiler at the point the
threads are created.

That said, this shouldn't matter: outside the bounds of a request
those threads aren't doing anything, so profiling at the point the
WSGI application entry point is called should be sufficient for
looking at the performance of actual requests.
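
For what it is worth, if you did want one set of cumulative cProfile
data covering every request a process handles, rather than per-request
output, one approach is to reuse a single Profile object in the WSGI
script file and dump the accumulated statistics when the process exits.
The following is only a sketch: it assumes a single threaded daemon
process (threads=1 on WSGIDaemonProcess), since cProfile only follows
the thread it was enabled in, and the settings module and output path
are placeholders you would need to change.

import atexit
import os
import cProfile

import django.core.handlers.wsgi

# Placeholder settings module; substitute your own project's settings.
os.environ['DJANGO_SETTINGS_MODULE'] = 'mysite.settings'

_application = django.core.handlers.wsgi.WSGIHandler()
_profiler = cProfile.Profile()

def application(environ, start_response):
    # Accumulate profile data across every request this process handles.
    return _profiler.runcall(_application, environ, start_response)

# Write the combined statistics out when the daemon process shuts down.
atexit.register(_profiler.dump_stats, '/tmp/django.profile')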

If you therefore aren't seeing issues with individual requests, then
the issue may be background threads created by your application.

For starters, what I would do is create a handler which uses the
'threading' module to get a list of all active threads in the process
and dump out at least their names and object types. The Apache/mod_wsgi
external threads should be easy to identify: from memory they have a
Dummy thread type, as they are effectively only markers for threads
that were created outside of Python.

Anyway, this will tell you if you have background threads running and
you can work out where they were created.
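
A minimal sketch of such a handler, written here as a plain WSGI
application (mount it wherever you can reach it on the live server, or
fold the body into a Django view):

import threading

def application(environ, start_response):
    # Report the name and object type of every active thread in this
    # process; externally created threads show up with a Dummy type.
    lines = ['%s: %s' % (t.getName(), type(t).__name__)
             for t in threading.enumerate()]
    output = '\n'.join(lines)
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response('200 OK', response_headers)
    return [output]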

If you have trouble working out what created the threads, adapt the
following WSGI script, turning it into a Django request handler, to
dump out the stack traces of all running threads in that process at
that point in time.

import sys
import traceback

def stacktraces():
    # Collect a formatted Python stack trace for every thread the
    # interpreter currently knows about.
    code = []
    for threadId, stack in sys._current_frames().items():
        code.append("\n# ThreadID: %s" % threadId)
        for filename, lineno, name, line in traceback.extract_stack(stack):
            code.append('File: "%s", line %d, in %s' % (filename, lineno, name))
            if line:
                code.append("  %s" % (line.strip()))
    return code

def application(environ, start_response):
    # Plain WSGI application returning the stack traces as text/plain.
    status = '200 OK'
    output = '\n'.join(stacktraces())
    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)
    return [output]
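
As a rough illustration of the Django request handler version (the view
name and URL wiring here are only examples), the same stacktraces()
function can simply be returned from an ordinary view:

from django.http import HttpResponse

def thread_stacks(request):
    # Dump the Python stack of every thread in this process as plain text.
    return HttpResponse('\n'.join(stacktraces()), mimetype='text/plain')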

Note though that this only works for threads executing in the context
of Python. If there is a possibility of C threads running which don't
call into Python at all, then you can only pick those up with gdb. To
do that, attach to the process ID using gdb and issue 'thread apply all
bt'. This will give a C stack trace for all threads. Just note that
mod_wsgi has a couple of background threads of its own in daemon mode.

You can find a mention of this gdb technique, albeit for tracking
stuck threads, at the end of:

  http://code.google.com/p/modwsgi/wiki/DebuggingTechniques

Graham

>  2. Has anyone else here experienced anything similar who could shed
> some light on my situation?
>
> Many thanks in advance!
>
> /David Christiansen
