On Mar 25, 3:39 am, David Christiansen <d...@dis.dk> wrote:
> Dear fellow Django users,
>
> We're beginning to be at wits end with a particular problem that we're
> having that I suspect has to do with server configuration issues, but
> might be something else.
>
> Our Django processes (whether it be Apache when running mod-python or
> independent daemons when running mod_wsgi in daemon mode) begin to
> consume CPU shortly after the server is started. I expect to see
> short spikes in CPU usage when requests occur, with some very small
> amount (single-digit %) when there are no requests occurring.
> Instead, we get a slowly increasing constant load on the CPU that
> eventually begins to affect the performance of the site. Restarting
> the server makes the problem go away, but it returns after a few
> requests have been made.
>
> I've been unable to replicate this problem when running Django locally
> on my computer, with wsgi or with manage.py runserver.
>
> Because the CPU consumption comes from the mod_wsgi daemon processes,
> I am quite sure that the problem lies somewhere in Python and not in
> an Apache module or in Postgres.
>
> As I can't replicate this problem locally, I've tried profiling the
> site on our actual server by running a second instance of Apache
> and mod_wsgi with the repoze.profile WSGI middleware. This middleware
> appears to work just fine, but I've not yet been able to replicate the
> problem when running it.
>
> Additionally, repoze.profile starts recording profile data when a
> request begins, and stops when the request is done. It strikes me as
> possible that it may not even record data on a long-running CPU hog if
> I can create it.
>
> I have two questions for you all:
>
> 1. Is there a good way to use cProfile or hotshot on the _entire_
> WSGI application rather than as a per-request middleware while running
> under mod_wsgi? cProfile's restriction that it must be called with a
> string that is eval'd seems to make this impossible, but I'm quite new
> to WSGI configuration so I might be missing something obvious.
> Hotshot would seem to have the same problem, even though it wants a
> callable rather than a string.
It is not quite that simple. Under Apache/mod_wsgi, the threads which handle requests are not created by Python code. Instead, they come from a pool of threads created at the Apache level, so there is nowhere in Python code where you can hook in a profiler at the point the threads are created.

That said, this shouldn't matter. Outside the bounds of a request those threads aren't doing anything, so profiling from the point where the WSGI application entry point is called should be sufficient for looking at the performance of actual requests.

If you aren't seeing issues with individual requests, then the issue may be background threads created by your application.

For starters, I would create a handler which uses the 'threading' module to get a list of all active threads in the process and dumps out at least their names and object types. The Apache/mod_wsgi external threads should be easy to identify: from memory, they have a dummy thread type (_DummyThread in the 'threading' module), as they are effectively only a marker for a thread that was created externally.

Anyway, this will tell you whether you have background threads running, and you can then work out where they were created. If you have trouble working out what created the threads, adapt the following WSGI script, turning it into a Django request handler, to dump out the stack trace of every thread running in that process at that time.
    import sys
    import traceback

    def stacktraces():
        # Collect a formatted stack trace for every thread currently
        # executing Python code in this process.
        code = []
        for threadId, stack in sys._current_frames().items():
            code.append("\n# ThreadID: %s" % threadId)
            for filename, lineno, name, line in traceback.extract_stack(stack):
                code.append('File: "%s", line %d, in %s' % (filename, lineno, name))
                if line:
                    code.append("  %s" % (line.strip()))
        return code

    def application(environ, start_response):
        status = '200 OK'
        # Encode to bytes: WSGI on Python 3 requires a byte string body.
        output = '\n'.join(stacktraces()).encode('utf-8')
        response_headers = [('Content-type', 'text/plain'),
                            ('Content-Length', str(len(output)))]
        start_response(status, response_headers)
        return [output]

Note, though, that this only works for threads executing in the context of Python. If there is a possibility of C threads running which don't call into Python at all, you can only pick those up with gdb. To do that, attach gdb to the process ID and issue 'thread apply all bt'. This will give a C stack trace for every thread. Just note that mod_wsgi has a couple of background threads of its own in daemon mode.

You can find a mention of this gdb technique, albeit for tracking stuck threads, at the end of:

  http://code.google.com/p/modwsgi/wiki/DebuggingTechniques

Graham

> 2. Has anyone else here experienced anything similar who could shed
> some light on my situation?
>
> Many thanks in advance!
>
> /David Christiansen
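
For the thread-listing handler suggested in the reply above, a minimal sketch might look like the following. It is written as a plain WSGI application rather than a Django view, and the name describe_threads is made up for illustration. It relies only on the standard 'threading' module. One caveat: a thread created outside Python appears in threading.enumerate() only once it has been registered with the 'threading' module, which is why the sketch calls threading.current_thread() first.

```python
import threading


def describe_threads():
    """Return one text line per thread known to the threading module."""
    # Calling current_thread() registers the calling thread if it was
    # created outside Python (it then shows up as a _DummyThread).
    threading.current_thread()
    lines = []
    for thread in threading.enumerate():
        lines.append("name=%s ident=%s type=%s daemon=%s" % (
            thread.name, thread.ident, type(thread).__name__, thread.daemon))
    return lines


def application(environ, start_response):
    # WSGI response bodies must be byte strings on Python 3.
    output = "\n".join(describe_threads()).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(output))),
    ])
    return [output]
```

Any thread in the output whose type is a regular Thread (or a subclass) was started from Python code, so those are the ones to chase down with the stack trace script.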