I've been struggling with a webserver which crashes every few days with the
dreaded "out of memory" problem. I've been trying to correlate it to spikes in
traffic, but that doesn't seem to be the case. I was quite excited last time
when the crash coincided with a visit by several search engines, including
being hammered by the Baiduspider, but the most recent crash was in a
relatively quiet traffic period, so I'm once again scratching my head. I
thought, therefore, that I'd consult the experts ...
The thing which I'm most confused about is that the server isn't particularly
busy. The server seems to die when we get a burst of over 400 page requests in
an hour, which doesn't seem like it should really be taxing a server of these
specs:
SPECS
==========
First the hardware spec. It's a real (as opposed to virtual) server with 2GB of
RAM, and 4GB swap. It's got an Intel(R) Pentium(R) Dual CPU E2160 @
1.80GHz, and a Western Digital 160GB ATA disk.
As for software, here's a quick summary. Running on CentOS 5.2:
- Apache/2.2.3 using prefork
- PHP 5.1.6 (cli)
- mysql Ver 14.12 Distrib 5.0.45
- Joomla 1.5.7 (latest version)
- WordPress 2.x (latest version)
CONFIG SETTINGS
================
The relevant sections from my Apache config.
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 5
<IfModule prefork.c>
StartServers 2
MinSpareServers 2
MaxSpareServers 4
ServerLimit 256
MaxClients 256
MaxRequestsPerChild 4000
</IfModule>
In php.ini I have the following memory-related settings.
safe_mode = Off
max_execution_time = 30
max_input_time = 60
memory_limit = 100M
log_errors = on
report_memleaks = On
error_log = /var/log/php_error.log
post_max_size = 8M
DIAGNOSTICS
===============
I wrote a script to capture various bits of memory information about the server
and set it to go off once every 15 minutes. Of course once the server freezes
up, the frequency that these cron jobs run at drops, but last crash I managed
to get one just before and one just after the first Out Of Memory report. Let
me know if you're interested in a copy of the script.
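For anyone wanting to collect similar snapshots, here is a minimal Python
sketch of a capture script of this kind. It is not the script that produced the
output below. The command list is inferred from the section headers and from
the ps command line visible in the post-crash listing, and "vmstat 1 5" (five
one-second samples) is only a guess at what the "vmstat 1.5" header stands for.

```python
import subprocess
from datetime import datetime

# (section title, command) pairs. The titles mirror the snapshot headers;
# the exact invocations are assumptions, not the original script.
COMMANDS = [
    ("uptime", ["uptime"]),
    ("free -m", ["free", "-m"]),
    ("vmstat 1 5", ["vmstat", "1", "5"]),
    ("ps top 20 Processes by CPU",
     ["ps", "-eo", "user,%mem,%cpu,pid,cmd", "--sort", "-%cpu"]),
]


def banner(title, width=64):
    """Return a header line such as '========== uptime =========='."""
    return f" {title} ".center(width, "=")


def run(argv, max_lines=None, timeout=60):
    """Run a command and return its stdout, or a short note if it cannot run."""
    try:
        out = subprocess.run(argv, capture_output=True, text=True,
                             timeout=timeout).stdout
    except (OSError, subprocess.TimeoutExpired) as exc:
        return f"(could not run {argv[0]}: {exc})\n"
    lines = out.splitlines()
    if max_lines is not None:
        lines = lines[:max_lines]
    return "\n".join(lines) + "\n"


def snapshot():
    """Collect one timestamped report covering every command in COMMANDS."""
    parts = [banner("SUMMARY"), datetime.now().ctime()]
    for title, argv in COMMANDS:
        # Keep only the header plus the top of the ps listing (20 lines).
        limit = 20 if argv[0] == "ps" else None
        parts.append(banner(title))
        parts.append(run(argv, max_lines=limit).rstrip("\n"))
    return "\n".join(parts) + "\n"


if __name__ == "__main__":
    print(snapshot(), end="")
```

Run from cron every 15 minutes, an entry along the lines of
"*/15 * * * * python3 /path/to/memsnap.py >> /var/log/memsnap.log" would append
one report per run (both paths are hypothetical). The timeout is there because
a machine that is already deep in swap may take a very long time to finish
even simple commands.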
There are no relevant error messages in the PHP error log or MySQL error log.
Before crash.
=============
=========================== SUMMARY ============================
Tue Dec 23 19:15:01 HKT 2008
=========================== uptime ==============================
19:15:01 up 3 days, 11:34, 0 users, load average: 5.11, 2.31, 1.00
========================== free -m ==============================
             total       used       free     shared    buffers     cached
Mem:          2001        879       1121          0         19        351
-/+ buffers/cache:        508       1493
Swap:         4094          0       4094
========================= vmstat 1.5 ============================
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd    free   buff  cache   si   so    bi    bo   in   cs us sy  id wa st
 3  1    124 1148504  19920 360436    0    0     9    30   13   35 10  1  89  0  0
 0  0    124 1224564  19924 360516    0    0     0     4 1029  249  8  1  91  1  0
 0  0    124 1224572  19924 360516    0    0     0     0 1021  226  0  0 100  0  0
 0  0    124 1224588  19924 360548    0    0     0     0 1017  229  0  0 100  0  0
 1  0    124 1286712  19932 360540    0    0     0   236 1023  436  6  0  94  0  0
================== ps top 20 Processes by CPU ===================
USER %MEM %CPU PID CMD
apache 2.0 18.7 14356 /usr/sbin/httpd
apache 2.0 17.3 14357 /usr/sbin/httpd
apache 3.2 13.8 14340 /usr/sbin/httpd
apache 2.0 13.5 14325 /usr/sbin/httpd
apache 3.2 10.8 14330 /usr/sbin/httpd
mysql 2.3 1.1 2569 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-external-locking --socket=/var/lib/mysql/mysql.sock
root 0.5 0.0 12267 /usr/sbin/httpd
root 0.0 0.0 1580 [kjournald]
root 0.0 0.0 227 [kswapd0]
root 0.0 0.0 226 [pdflush]
root 0.0 0.0 2326 pcscd
ntp 0.2 0.0 2477 ntpd -u ntp:ntp -p /var/run/ntpd.pid -g
root 0.7 0.0 2787 /usr/bin/python -tt /usr/sbin/yum-updatesd
root 0.0 0.0 1 init [3]
68 0.1 0.0 2767 hald
root 0.0 0.0 2097 auditd
root 0.0 0.0 225 [pdflush]
root 0.0 0.0 422 [kjournald]
root 0.0 0.0 483 /sbin/udevd -d
After crash.
=============
=========================== SUMMARY ============================
Tue Dec 23 19:46:32 HKT 2008
=========================== uptime ==============================
19:46:34 up 3 days, 12:05, 0 users, load average: 111.60, 108.31, 85.79
========================== free -m ==============================
             total       used       free     shared    buffers     cached
Mem:          2001       1987         14          0          1         20
-/+ buffers/cache:       1964         36
Swap:         4094       4067         27
========================= vmstat 1.5 ============================
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r   b    swpd  free buff  cache   si   so   bi   bo   in  cs us sy id wa st
 0 113 4164604  9348 1264  22412    6   11   16   41   16  37 10  1 88  1  0
 0 116 4164196  7876 1264  22712 1240  304 1492  304 1169 623  4  1  0 95  0
 0 116 4165224  8464 1268  22708  320 1144  324 1148 1300 410  0  2  0 99  0
 0 116 4172532  8252 1280  22820  980 7512 1192 7556 1232 442  0  7  0 93  0
 0 116 4176940 11448 1296  22888 1388 4716 1456 4720 1178 452  1  5  0 95  0
================== ps top 20 Processes by CPU ===================
USER %MEM %CPU PID CMD
root 0.0 4.0 15290 ps -eo user,%mem,%cpu,pid,cmd --sort -%cpu
apache 0.6 1.2 14340 /usr/sbin/httpd
mysql 1.7 1.1 2569 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-external-locking --socket=/var/lib/mysql/mysql.sock
apache 0.5 0.9 15283 /usr/sbin/httpd
apache 0.6 0.9 14330 /usr/sbin/httpd
apache 0.5 0.8 14356 /usr/sbin/httpd
apache 0.9 0.6 14390 /usr/sbin/httpd
apache 0.5 0.3 14406 /usr/sbin/httpd
apache 0.5 0.2 14395 /usr/sbin/httpd
apache 0.6 0.2 14464 /usr/sbin/httpd
apache 0.6 0.2 14437 /usr/sbin/httpd
apache 0.6 0.2 14489 /usr/sbin/httpd
apache 0.6 0.2 14422 /usr/sbin/httpd
apache 0.6 0.2 14454 /usr/sbin/httpd
apache 0.5 0.2 14473 /usr/sbin/httpd
apache 0.6 0.2 14510 /usr/sbin/httpd
apache 0.5 0.2 15253 /usr/sbin/httpd
apache 0.5 0.2 14498 /usr/sbin/httpd
apache 0.5 0.2 14467 /usr/sbin/httpd
Discussion
================
OK, if you're still with me, thanks for getting this far. So before the Out of
Memory, the CPU load is around 70% and the load average is high, but not
critical. After the Out of Memory, the entire swap is full, the load average is
insane, and the disk is swapping like crazy. There also seem to be a lot of
httpd processes spawned, but not really doing much. At this point the server is
inaccessible. Over the next hour or two the swap never really empties, and only
returns to normal after a reboot.
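As a rough sanity check on these numbers, the sketch below compares the memory
the busy httpd children appear to use with what the prefork settings would
allow. The figures are copied from the config and the pre-crash snapshot;
RESERVED_MB (memory set aside for MySQL and everything else) is an assumed
value, and the ps %MEM column counts pages shared between children once per
process, so the per-child figure is probably an overestimate.

```python
# Figures taken from the post; RESERVED_MB is an assumption for illustration.
TOTAL_RAM_MB = 2001                            # "Mem: total" from free -m
SWAP_MB = 4094                                 # "Swap: total" from free -m
MAX_CLIENTS = 256                              # prefork MaxClients
BUSY_HTTPD_PCT_MEM = [2.0, 2.0, 3.2, 2.0, 3.2] # %MEM of busy httpd, pre-crash
RESERVED_MB = 300                              # assumed: MySQL, kernel, daemons


def avg_child_mb(pct_mem, total_ram_mb):
    """Average resident size of an httpd child, derived from ps %MEM values."""
    per_child = [p / 100.0 * total_ram_mb for p in pct_mem]
    return sum(per_child) / len(per_child)


def worst_case_mb(max_clients, child_mb):
    """Memory needed if every allowed child were running at that size."""
    return max_clients * child_mb


def children_that_fit(total_ram_mb, reserved_mb, child_mb):
    """How many children of that size fit in RAM without touching swap."""
    return int((total_ram_mb - reserved_mb) // child_mb)


if __name__ == "__main__":
    child = avg_child_mb(BUSY_HTTPD_PCT_MEM, TOTAL_RAM_MB)
    print(f"average busy child: {child:.1f} MB")
    print(f"MaxClients worst case: {worst_case_mb(MAX_CLIENTS, child):.0f} MB")
    print(f"RAM + swap available: {TOTAL_RAM_MB + SWAP_MB} MB")
    print(f"children fitting in RAM: "
          f"{children_that_fit(TOTAL_RAM_MB, RESERVED_MB, child)}")
```

On these figures a busy child comes out at roughly 50 MB, so 256 of them would
want around 12 GB against about 6 GB of RAM and swap combined. Whether the
server ever actually approaches MaxClients is not something the snapshots
above can confirm.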
I've been trying a few things by changing the Apache settings according to
various suggestions around the internet, but I'm really stabbing in the dark
here. Is there anything anyone can see which is obviously wrong about my
configuration? I'm on the point of just slapping in some more RAM and
forgetting about it, but I'd really like to understand it first.
I've also tried enabling the server-status page, but unless I'm watching it
when it goes south, it hasn't been much help.
So. Any advice?
JM
---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.