On Wed, 01 Feb 2012 16:56:34 -0500
Kris Deugau wrote:

> RW wrote:
> > On Wed, 1 Feb 2012 17:36:36 +0000
> > RW wrote:
> >
> >> spamd adjusts the number of children when it gets a message from
> >> one of them reporting that it's idle. If you have all the busy
> >> children locked up, and have N idle children and then exactly N
> >> messages that trigger the buggy regex come in at the same time,
> >> you lock up all the children and won't get any new idle events.
> >>
> >> In your case (--min-spare=1 --max-spare=1) you have N=1 which
> >> turns a rare scenario into a common one.
> >
> > Sorry, that's wrong. It should be: when all of the busy children are
> > locked up and you get a consecutive run of "bad" messages that
> > lock up all the remaining idle processes, you won't get any new
> > idle events.
> >
> > The point's the same though: if you have higher values of min-spare
> > and max-spare, it's less likely to happen.
> 
> I've adjusted one machine with min-spare and max-spare at 5, the
> other with min-spare 2 and max-spare 5, but I don't think that's it.
> (Although even reducing the number of incidents will help...)
> 
> Under normal processing, a burst of mail will show:
> 
> child states: IBIBBIII
> child states: BBIBBIII
> child states: BBBBBIII
> child states: BBBBBBII
> child states: BBBBBBBI
> child states: BBBBBBBB
> child states: BBBBBBBBB
> child states: BBBBBBBBBB
> child states: BBBBBBBBBBB
> ...
> 
> etc, potentially up to max-children, within a few seconds.


I was describing the special case of how spamd could run out of
children without hitting max-children. It could also run out the
normal way. And when there are clients waiting and a child becomes
idle, you do get a chain reaction that adds children very rapidly.
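
To put a rough number on the min-spare point quoted above: suppose each
incoming message independently has some probability p of hitting the bad
regex (an assumption for illustration, not something measured). Then the
chance that the next N messages all hit it is p**N. The sketch below
covers only that last step; it takes for granted that the busy children
are already locked up, and the function name is made up.

```python
def all_idle_locked_probability(p_bad: float, idle_children: int) -> float:
    """Chance that a run of `idle_children` consecutive messages are all "bad".

    Assumes messages are independent, each bad with probability `p_bad`.
    This is a back-of-envelope model, not anything spamd computes.
    """
    if not 0.0 <= p_bad <= 1.0:
        raise ValueError("p_bad must be between 0 and 1")
    if idle_children < 0:
        raise ValueError("idle_children must not be negative")
    return p_bad ** idle_children


if __name__ == "__main__":
    # Compare one spare child against five, for a made-up 1% bad-message rate.
    for spare in (1, 5):
        print(spare, all_idle_locked_probability(0.01, spare))
```

The probability falls off geometrically with the number of idle
children, which is why higher spare values should make this case rarer
without ruling it out.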

> During one of these lockups, it stalls whether or not there have been 
> free idle children (ie, potentially around that second or third line, 
> with 3 or 4 idle children).

That's to be expected since the "child states" logging is generated by
the same function that modifies the number of children.  

I think the important question here is whether you see high CPU usage
when it locks up with more "B"s than you have cores. If you don't, then
it's not a problem regex, since children stuck in one would be burning
CPU.
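
A quick way to check this against the logs is to count the "B"s in a
"child states" line and compare that with the core count. This is only
a sketch: the pattern is based on the log lines quoted above, I'm
assuming "B" marks a busy child as in those lines, and the function
names are hypothetical.

```python
import os
import re
from typing import Optional

# Matches the state string in lines like "child states: BBIBBIII".
STATES_RE = re.compile(r"child states:\s*([A-Z]+)")


def busy_children(line: str) -> Optional[int]:
    """Return the number of 'B' entries in a child-states line, or None."""
    match = STATES_RE.search(line)
    if match is None:
        return None
    return match.group(1).count("B")


def more_busy_than_cores(line: str, cores: Optional[int] = None) -> bool:
    """True if the line shows more busy children than CPU cores."""
    if cores is None:
        # os.cpu_count() can return None; fall back to 1 in that case.
        cores = os.cpu_count() or 1
    busy = busy_children(line)
    return busy is not None and busy > cores


if __name__ == "__main__":
    sample = "child states: BBBBBBII"
    print(busy_children(sample), more_busy_than_cores(sample, cores=4))
```

If this reports more busy children than cores at the moment of a
lockup, it's worth looking at CPU usage for the same period.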

> None of that explains why the master spamd stops accepting new 
> connections, AFAICT.

It depends what you mean by "accepting new connections". If you mean
the TCP handshake, then I've no idea, but I think anything above that
level needs idle children.
