On Mon, Aug 22, 2011 at 8:10 PM, Jonathan Swartz <swa...@pobox.com> wrote:
> We use Apache/mod_perl 2 and occasionally get a child httpd process that
> spins out of control, either consuming ever-increasing amounts of memory or
> max cpu. Usually due to an infinite loop or other bug in a specific part of
> the site - this sort of thing happens.
>
> I would like to monitor for such httpd children every second or so, and
> when finding one, send it a USR2 signal so it can dump its current Perl
> stack to our error logs.
>

A few ideas:

- If your requests are typically short and the memory allocation uses
  enough CPU time, you could set a soft limit for CPU time, then catch
  $SIG{XCPU}. You would also need to limit how many requests your child
  processes handle, since the CPU limit counts time accumulated over the
  whole life of the child, not per request. It worked for me in a quick
  test.

- If the memory usage is significant, as a quick check you could look at
  the total free memory available on the system, and only if it falls
  below a threshold do a more complex check with Proc::ProcessTable.

- If the runaway process causes the load average to go up, you could look
  at the load average, and only if it rises above a threshold do a more
  complex check with Proc::ProcessTable.

- If your requests are typically short, you could create a small watchdog
  server; a request would register its PID with the watchdog server, then
  unregister when it finishes. If the watchdog sees a request register
  that does not complete within some time limit, it could send SIGUSR2. I
  have used a solution like this in the past, and it is effective, if a
  bit cumbersome.

- Apache::Scoreboard
  <http://search.cpan.org/~mjh/Apache-Scoreboard-2.09.2/> can get you the
  PIDs of just the Apache processes, and some basic state information. You
  might be able to use this to make your process table scan more
  efficient. Maybe you could write a URL handler to do your checking and
  signaling using the scoreboard from within Apache, then load the URL
  periodically to trigger the test.

Hope this is helpful,

-----Scott.
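
For illustration, here is a minimal sketch of the bookkeeping behind the watchdog idea (requests register a PID, unregister on completion, and anything still registered past a time limit gets SIGUSR2). It is not the code from the setup described above. The function names, the in-memory hash, and the explicit `$now` argument are all assumptions made for the example, and the transport between the httpd children and the watchdog (a socket, a pipe, etc.) is left out entirely.

```perl
use strict;
use warnings;

# Hypothetical in-memory registry: PID => epoch time the request started.
my %started;

# Record that the child with this PID has begun handling a request.
# $now is optional and defaults to the current time.
sub register_request {
    my ($pid, $now) = @_;
    $started{$pid} = defined $now ? $now : time;
    return;
}

# Forget the PID once its request has completed normally.
sub unregister_request {
    my ($pid) = @_;
    delete $started{$pid};
    return;
}

# Return the PIDs whose requests have been running for more than
# $limit seconds, in ascending numeric order.
sub overdue_pids {
    my ($limit, $now) = @_;
    $now = time unless defined $now;
    my @pids = sort { $a <=> $b }
               grep { $now - $started{$_} > $limit } keys %started;
    return @pids;
}

# Send SIGUSR2 to every overdue child, then drop it from the registry
# so the same request is not signalled again on the next pass.
sub signal_overdue {
    my ($limit, $now) = @_;
    my @pids = overdue_pids($limit, $now);
    for my $pid (@pids) {
        kill 'USR2', $pid;
        delete $started{$pid};
    }
    return @pids;
}
```

In this sketch the watchdog loop would call something like `signal_overdue(30)` once per second. The child still needs its own `$SIG{USR2}` handler to dump the Perl stack, for example with `Carp::cluck`.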