I've seen this two or three times now, and I'm not sure what to make of it.
Outward appearance is that we get hit with a ton of spam, or perhaps that an RBL goes out. I wind up with many copies of spamd running, many more than the -m parameter should allow. (And forget about procmail and sendmail. Hundreds of copies.) They never seem to go away, or at best, go away very, very slowly.
Yep, I saw that before we upgraded our server recently. Basically, when a flood of junk comes in, your server tries to launch more copies of sendmail/procmail/spamd than it can handle, runs out of memory, starts swapping, and gets bogged down. The more it swaps, the further it falls behind the incoming connection rate and the more processes try to run at once, so the situation compounds itself. If you're unlucky it will not recover by itself.
The monitors show that disk activity is through the roof.
Yep, and if you use vmstat you should see that swapping activity (the si and so columns) is the primary cause of the disk activity.
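If it helps, here is a rough sketch of how you might total the si and so columns over a few vmstat samples. The function name swap_totals is my own invention, and it assumes the usual Linux vmstat layout where a header row names the columns. Keep in mind that the first data row vmstat prints is an average since boot, so it is the later rows that tell you what is happening right now.

```shell
#!/bin/sh
# Sum the si/so (swap-in/swap-out) columns of vmstat output read from stdin.
# swap_totals is a hypothetical helper name, not part of any package.
swap_totals() {
    awk '
        # Skip lines until the header row that names the si and so columns.
        !found {
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si = i
                if ($i == "so") so = i
            }
            if (si && so) found = 1
            next
        }
        # Every later row is a data sample: accumulate both columns.
        { in_sum += $si; out_sum += $so }
        END { printf "si=%d so=%d\n", in_sum, out_sum }
    '
}

# Usage: three samples, five seconds apart
#   vmstat 5 3 | swap_totals
```

Totals that stay well above zero while the machine is crawling point at swap thrashing rather than ordinary mail I/O.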
They all but cripple the machine. Then I go in and kill sendmail, kill all the procmails and spamds, and restart things, and everything clears up very quickly.
Of course, because when you kill all those processes you free up physical memory and the machine is no longer in a state of swap thrashing, so it becomes responsive again almost immediately.
So quickly that it makes me suspicious as to what was REALLY wrong. Maybe something more than just load, or RBL problems. Maybe a locking problem.
Nope, this is a problem with stock installs of sendmail/procmail and/or a lack of memory. How much memory does the server have? Roughly how many messages a day do you process?
In your sendmail.cf you'll want to tune:
QueueLA RefuseLA MaxDaemonChildren ConnectionRateThrottle
Of course, what you tune them to depends on the specs of your server.
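Before changing anything, it is worth checking what those four options are currently set to. A minimal sketch, assuming your sendmail.cf uses the usual "O Name=value" option lines (with a leading # when the option is left at its default); show_throttle_opts is a made-up helper name and the path in the usage comment is an assumption that varies by distribution:

```shell
#!/bin/sh
# Print the four throttling options from a sendmail.cf, whether active
# ("O Name=value") or still commented out ("#O Name=value").
# show_throttle_opts is a hypothetical helper name.
show_throttle_opts() {
    grep -E '^#?O (QueueLA|RefuseLA|MaxDaemonChildren|ConnectionRateThrottle)=' "$1"
}

# Usage (path is an assumption; adjust for your system):
#   show_throttle_opts /etc/mail/sendmail.cf
```

Any line that comes back with a leading # is not in effect, so that option is still at whatever default your sendmail build uses.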
Using the -m option of spamd is, IMHO, a bad idea and compounds the problem, because it causes procmail and sendmail processes to *WAIT* when there is too much incoming mail to handle at once. All these waiting processes consume memory and eventually push you into swapping. What you want them to do is give up, go away, and retry later. I do this with a local delivery wrapper that also does virus scanning, but here is a stripped-down version of my wrapper script:
(sorry if this is off topic for the list, but I see enough people with this problem that it might help them)
---------------
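#!/bin/sh
# Integer part of the 1-minute load average (first field of /proc/loadavg).
loadavg=$(sed 's/\..*//' /proc/loadavg)
if [ "$loadavg" -gt 14 ]; then
    /usr/bin/logger -i -p mail.warn -t "$(basename "$0")" "WARNING: Returning temporary failure due to load average of $loadavg"
    # 75 is EX_TEMPFAIL: sendmail requeues the message and retries later.
    exit 75
fi
# Count running procmail processes, excluding the grep itself.
procmail=$(ps -Af | grep procmail | grep -v grep | wc -l)
if [ "$procmail" -gt 29 ]; then
    /usr/bin/logger -i -p mail.warn -t "$(basename "$0")" "WARNING: Returning temporary failure due to $procmail procmail processes running - load average $loadavg"
    exit 75
fi
# Replace this shell with procmail so no extra process is left waiting.
exec /usr/bin/procmail "$@"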
---------------
(the logger lines are single lines, the email will probably wrap them hideously)
Basically, if it sees a load average above the limit (in this case 15 or more) or more than a certain number of simultaneous procmail processes (in this case 30 or more), it returns a temporary failure (exit code 75), sendmail puts the message back on the queue, and you don't have a copy of procmail and sendmail waiting around. The next time a queue runner comes along, it retries the message. (So you probably want your queue-run interval to be fairly short.)
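If you want to experiment with the thresholds without touching live mail, the decision logic can be pulled out into small functions. This is only a sketch: load_int and check_limits are names I made up, and the limits are simply the ones from the wrapper above. Exit code 75 is EX_TEMPFAIL from sysexits.h.

```shell
#!/bin/sh
MAX_LOAD=14      # defer when the integer load average exceeds this
MAX_PROCS=29     # defer when more than this many procmail processes exist
EX_TEMPFAIL=75   # sysexits.h code meaning "temporary failure, retry later"

# Integer part of the 1-minute load average from a loadavg-format file.
# Defaults to /proc/loadavg; a different path can be passed for testing.
load_int() {
    cut -d. -f1 "${1:-/proc/loadavg}"
}

# check_limits LOAD NPROCS
# Return EX_TEMPFAIL if either limit is exceeded, 0 otherwise.
check_limits() {
    [ "$1" -gt "$MAX_LOAD" ] && return "$EX_TEMPFAIL"
    [ "$2" -gt "$MAX_PROCS" ] && return "$EX_TEMPFAIL"
    return 0
}

# Usage:
#   check_limits "$(load_int)" 12 || echo "would defer delivery"
```

Feeding it made-up numbers shows where the cut-offs fall: a load of 14 with 29 procmail processes is still accepted, while 15 or 30 tips it into a deferral.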
You'd have to edit the local mailer definition in your sendmail.cf to point at this script, so it's not for the faint of heart ;-)
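To see what you would be editing, something like the following prints the current local mailer definition. It is a sketch that assumes the definition starts with a line beginning "Mlocal" and continues on indented lines, which is how generated sendmail.cf files normally lay it out; show_local_mailer is a hypothetical name. The P= field in that definition is the program path, which is the part that would point at the wrapper.

```shell
#!/bin/sh
# Print the Mlocal mailer definition, including its continuation lines.
# show_local_mailer is a hypothetical helper name.
show_local_mailer() {
    awk '
        /^Mlocal/        { show = 1; print; next }  # first line of the definition
        show && /^[ \t]/ { print; next }            # indented continuation line
                         { show = 0 }               # anything else ends it
    ' "$1"
}

# Usage (path is an assumption; adjust for your system):
#   show_local_mailer /etc/mail/sendmail.cf
```

Keep a backup copy of sendmail.cf before changing it, since a broken mailer definition stops local delivery entirely.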
I'm sure some procmail guru could rewrite this as a procmail script, which would make it easier to implement. (Any takers?)
Regards, Simon
Spamassassin-talk mailing list https://lists.sourceforge.net/lists/listinfo/spamassassin-talk