RW wrote:
On Wed, 1 Feb 2012 17:36:36 +0000
RW wrote:
spamd adjusts the number of children when it gets a message from one
of them reporting that it's idle. If you have all the busy children
locked up, and have N idle children, and then exactly N messages that
trigger the buggy regex come in at the same time, you lock up all the
children and won't get any new idle events.
In your case (--min-spare=1 --max-spare=1) you have N=1 which turns a
rare scenario into a common one.
Sorry, that's wrong. It should be: when all of the busy children are
locked up and you get a consecutive run of "bad" messages that
lock up all the remaining idle processes, you won't get any new idle
events.
The point's the same though: if you have higher values of min-spare
and max-spare, it's less likely to happen.
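As a rough illustration of why more spare children make this less
likely, here is a minimal sketch. It assumes (this is not taken from
spamd) that each incoming message independently triggers the bad regex
with some probability p_bad; the function name and the 0.01 rate are
hypothetical. Under that assumption, a run of N consecutive bad
messages has probability p_bad to the power N.

```python
# Rough model only. Assumption (not from spamd itself): each incoming
# message independently hits the bad regex with probability p_bad.

def all_idle_locked_probability(p_bad: float, idle_children: int) -> float:
    """Chance that the next `idle_children` messages in a row are all bad."""
    return p_bad ** idle_children


if __name__ == "__main__":
    p_bad = 0.01  # hypothetical rate, purely for illustration
    for n in (1, 2, 5):
        print(f"N={n}: {all_idle_locked_probability(p_bad, n):.3g}")
```

Real traffic is unlikely to be independent (a bad message may well be
retried or arrive in batches), so treat this as an upper-level
intuition rather than a prediction.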
I've set min-spare and max-spare to 5 on one machine, and min-spare 2
and max-spare 5 on the other; but I don't think that's it. (Although
even reducing the number of incidents will help...)
Under normal processing, a burst of mail will show:
child states: IBIBBIII
child states: BBIBBIII
child states: BBBBBIII
child states: BBBBBBII
child states: BBBBBBBI
child states: BBBBBBBB
child states: BBBBBBBBB
child states: BBBBBBBBBB
child states: BBBBBBBBBBB
...
etc, potentially up to max-children, within a few seconds.
During one of these lockups, it stalls whether or not there are
idle children free (i.e., potentially around that second or third line,
with 3 or 4 idle children).
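To check how many children were idle when a stall begins, the
"child states:" lines can be tallied with something like the sketch
below. It assumes the log lines look like the ones quoted above and
only counts B (busy) and I (idle); the function name is my own.

```python
import re

# Matches the state string in lines such as "child states: IBIBBIII".
STATE_RE = re.compile(r"child states:\s*([A-Z]+)")


def count_states(line):
    """Return (busy, idle) for a 'child states:' line, or None if absent."""
    match = STATE_RE.search(line)
    if match is None:
        return None
    states = match.group(1)
    return states.count("B"), states.count("I")


if __name__ == "__main__":
    import sys

    # Read log lines on stdin and print the tally for each matching line.
    for log_line in sys.stdin:
        counts = count_states(log_line)
        if counts is not None:
            busy, idle = counts
            print(f"busy={busy} idle={idle}")
```

Piping the spamd log through this makes it easy to see whether the last
line before a stall still had idle children.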
None of that explains why the master spamd stops accepting new
connections, AFAICT.
-kgd