On Sat, Apr 5, 2014 at 3:18 PM, Christopher Schultz <ch...@christopherschultz.net> wrote:

> Igor,
>
> On 4/4/14, 5:39 AM, Igor Cicimov wrote:
> >
> > On 04/04/2014 1:05 AM, "Christopher Schultz"
> > <ch...@christopherschultz.net> wrote:
> >>
> >> All,
> >>
> >> I'm having a problem in production I've never seen before. We are
> >> running a pair of AWS EC2 m1.micro web servers where only one of them
> >> is really in service at any given time. The httpd instance serves some
> >> static content and forwards a great deal of traffic via stunnel to a
> >> single back-end Tomcat server using mod_jk 1.2.37. We have been running
> >> under this configuration for several years with no problems.
> >>
> > Enable the stunnel logs; maybe they will reveal something?
>
> I don't think stunnel has changed much. Besides, the stunnel processes
> aren't eating up the CPU: it's the httpd processes that are.
>
> >> Last weekend, we upgraded our OS to Amazon Linux 2014.03 (32-bit) from
> >> Amazon's previous version (I can't remember which one), including the
> >> package-refresh that comes with it for httpd. The current kernel version
> >> is 3.10.34. The current httpd version is 2.2.26. The package name is
> >> "httpd-2.2.26-1.1.amzn1.i686" if anyone is interested. We are using a
> >> prefork MPM with the following (default) settings:
> >>
> >> StartServers 8
> >> MinSpareServers 5
> >> MaxSpareServers 20

The range of spare servers here is < 10% of your max (256), so if the
load changes by more than 10%, httpd will have to kill a child or fork a
new one.

> >> ServerLimit 256
> >> MaxClients 256
> >> MaxRequestsPerChild 4000

4000 is generally considered very low (needless termination/creation of
child processes). See if you're getting frequent child process creation
(server status report, staring at ps outputs). If so, get rid of it: set
MaxSpareServers higher, and set MaxRequestsPerChild very high or disable
it by setting it to 0.

Perhaps look at pages 39 and following in
https://blogs.oracle.com/trawick/resource/DeepDive/WebStackDeepDiveApache.pdf?
(Note that there is a StartServers/MinSpareServers blunder in there
(should be MaxSpareServers).)

> >>
> >> What I can observe is that the CPU load average is rising from the usual
> >> sub-2.0 value to sometimes as high as 70. That's seventy, not
> >> seven-point-oh.
> >>
> >> I see no errors in the log, and httpd doesn't seem to be dropping any
> >> requests... just running very very slowly.
> >>
> > What if you increase the LogLevel to debug? Maybe JkLogLevel too.
>
> I could certainly do that, but the mod_jk binary is the same as before
> the upgrade.
>
> >> It seems to come in waves: the load will go up, and everything will slow
> >> down, and then we'll get a reprieve.
> >
> > What's the memory usage at those times? If you have sysstat installed you
> > can run sar for some stats about disk, CPU and memory.
>
> I installed sysstat and am collecting data. Monday morning is when we'll
> get killed, and I'll have the data then.
>
> > First thing I would do is move from prefork to mpm worker. You
> > should see significant improvement.
>
> While that may be the case, something else must have changed. We had
> been using the prefork MPM beforehand without a problem.
>
> We've upgraded to Linux kernel 3.10.35 at the suggestion of the AWS
> support folks, but things still look pretty ugly. I've resurrected an
> old snapshot to compare the performance relative to the upgraded
> instance. If the 3.10.35 instance falls over on Monday, I'll switch over
> to the older kernel instance.

Can you run oprofile or something similar in that environment?

> Thanks,
> -chris

-- 
Born in Roswell... married an alien...
http://emptyhammock.com/
http://edjective.org/
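
For the "staring at ps outputs" check mentioned above, here is a rough sketch in Python of one way to put a number on child churn. It is only an illustration: the helper names `httpd_pids` and `churn` are made up, and it assumes a procps-style `ps` that accepts `-C` and `-o pid=`, and that the processes show up under the command name `httpd`.

```python
import subprocess


def httpd_pids(process_name="httpd"):
    """Return the set of PIDs currently running under the given command name."""
    # Assumption: procps `ps`; `-C NAME -o pid=` prints one PID per line, no header.
    result = subprocess.run(
        ["ps", "-C", process_name, "-o", "pid="],
        capture_output=True,
        text=True,
        check=False,  # ps exits non-zero when nothing matches; treat that as "no PIDs"
    )
    return {int(token) for token in result.stdout.split()}


def churn(before, after):
    """Compare two PID snapshots.

    Returns (exited, created): how many PIDs disappeared between the two
    snapshots and how many new ones appeared.
    """
    return len(before - after), len(after - before)
```

Taking two `httpd_pids()` snapshots a minute or so apart and passing them to `churn()` gives a crude count of children killed and forked in that interval. If both numbers stay high under steady load, that would be consistent with the spare-server range or MaxRequestsPerChild forcing needless process turnover.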