RW wrote:
spamd adjusts the number of children when it gets a message from
one of them reporting that it's idle. If you have all the busy
children locked up, and have N idle children and then exactly N
messages that trigger the buggy regex come in at the same time,
you lock up all the children and won't get any new idle events.
In your case (--min-spare=1 --max-spare=1) you have N=1, which
turns a rare scenario into a common one.
Sorry, that's wrong. It should be: when all of the busy children are
locked up and you get a consecutive run of "bad" messages that
lock up all the remaining idle processes, you won't get any new
idle events.
The point's the same, though: if you have higher values of min-spare
and max-spare, it's less likely to happen.
I was describing the special case of how spamd could run out of
children without hitting max-children.
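To put a rough number on "less likely": a back-of-the-envelope sketch,
assuming each incoming message independently triggers the lockup with
some probability p_bad (a made-up figure, not something measured on any
real system) and that all busy children are already stuck. The function
name and the sample values are illustrative only.

```python
def prob_all_spares_lock(p_bad, spare_children):
    """Chance that the next `spare_children` messages are all "bad".

    Simplifying assumption: messages are independent and each one locks
    up a child with probability p_bad. Real mail traffic may well be
    burstier than this.
    """
    if not 0.0 <= p_bad <= 1.0:
        raise ValueError("p_bad must be between 0 and 1")
    if spare_children < 1:
        raise ValueError("spare_children must be at least 1")
    return p_bad ** spare_children


if __name__ == "__main__":
    # Hypothetical rate: 1 message in 100 hits the bad pattern.
    for n in (1, 2, 5):
        print(n, prob_all_spares_lock(0.01, n))
```

With one spare child the chance is just p_bad itself; every extra spare
multiplies it by p_bad again, which is why N=1 is the worst case under
this model.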
Hm. I think I sort of get where you're going but I'm not sure I see the
whole picture. Can you lay out a timeline for what happens in this case?
I think the important question here is whether you see high CPU usage
when it locks up with more "Bs" than you have cores. If you don't, then
it's not a problem regex.
Mm, not for the entire period of the lockup. If there's a large message
with gobs of text to chew on, a child will peg a core for up to ~35s
when running normally, and I don't think I've seen any cores pegged for
an extended period when I've caught an incident live.
It depends what you mean by "accepting new connections". If you mean
the TCP handshake then I've no idea, but I think anything higher needs
idle children.
Mmyeah, that's part of what's bugging me. Our monitoring uses the
check_tcp plugin for Nagios like so:
$USER1$/check_tcp -H '$HOSTADDRESS$' -p '$ARG1$' -E -s 'PING SPAMC/1.4\n' -e 'SPAMD/1.5 0 PONG'
and during these failures it returns a timeout. From testing by hand I
don't think it returns anything different for a TCP failure vs "timeout
waiting for command", but since I haven't been able to reproduce the
failure in any test environments I've been short on by-hand testing for
comparison.
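
For by-hand testing, something like the following might help separate
the two cases. It's a minimal sketch, not a replacement for check_tcp:
it sends the same PING string as the Nagios check and reports whether
the failure was at TCP connect time or while waiting for the PONG. The
function name, the result labels, and the default timeout are my own
choices.

```python
import socket


def probe_spamd(host, port, timeout=5.0):
    """Classify a spamd PING attempt by the stage at which it fails."""
    try:
        # Stage 1: TCP handshake only.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "connect_failed"

    with sock:
        try:
            # Stage 2: send the same PING the Nagios check sends and
            # wait for a reply.
            sock.settimeout(timeout)
            sock.sendall(b"PING SPAMC/1.4\n")
            reply = sock.recv(1024)
        except socket.timeout:
            # Handshake completed, but nothing answered in time.
            return "connected_no_reply"
        except OSError:
            return "connection_error"

    if not reply:
        return "closed_without_reply"
    if reply.startswith(b"SPAMD/1.5 0 PONG"):
        return "ok"
    return "unexpected_reply"


if __name__ == "__main__":
    # 783 is spamd's usual default port; adjust for your setup.
    print(probe_spamd("127.0.0.1", 783))
```

If this reports "connected_no_reply" during an incident, the handshake
is still completing and the stall is higher up, which would fit the
"no idle children" theory.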
-kgd