John Horne writes:
> On Wed, 2006-09-06 at 11:38 -0400, Theo Van Dinter wrote:
> 
> > My understanding (I haven't really looked at that code) is that "K" means the
> > child has been killed but it hasn't exited yet.  If a child is in that state
> > for more than, say, 5 seconds, there's likely an issue where it doesn't
> > actually die off, imo.
> > 
> > You should generally see states of I or B.
> > 
> I get the feeling that something is wrong here. I have restarted SA, and
> grepped the log file. It shows:
> 
> =======================================================================
> prefork: child states: BI
> prefork: child states: BB
> prefork: child states: BBB
> prefork: child states: BBBB
> prefork: child states: BBBBS
> prefork: child states: BBBBII
> prefork: child states: IBBBII
> prefork: child states: IIBBIK
> prefork: child states: IIIBKK
> prefork: child states: IIKIKK
> prefork: child states: IBKKKK
> prefork: child states: IIKKKK
> prefork: child states: BBKKKK
> prefork: child states: BBKKKKB
> prefork: child states: BBKKKKBB
> prefork: server reached --max-children setting, consider raising it
> prefork: child states: BIKKKKBB
> prefork: child states: IBKKKKBB
> prefork: child states: IBKKKKIB
> prefork: child states: IIKKKKIB
> prefork: child states: BIKKKKKI
> prefork: child states: IBKKKKKB
> prefork: child states: BBKKKKKI
> prefork: child states: BIKKKKKI
> prefork: child states: IIKKKKKI
> prefork: child states: IBKKKKKK
> prefork: child states: IIKKKKKK
> prefork: child states: BBKKKKKK
> prefork: server reached --max-children setting, consider raising it
> prefork: child states: BBKKKKKK
> prefork: server reached --max-children setting, consider raising it
> prefork: child states: IBKKKKKK
> prefork: child states: BIKKKKKK
> prefork: child states: IIKKKKKK
> =======================================================================
> 
> Some of the processes seem to almost immediately go into the 'killed'
> state and stay there. 'ps auxww' shows that all 8 child processes are
> started. Running an strace (this is a Fedora Core 4 server) on some of
> the processes seems to show that they are waiting on select, and then
> get a 'resources unavailable' error. What resource I have no idea. E.g:
> 
> =======================================================================
> strace -Ff -p 12805
> Process 12805 attached - interrupt to quit
> select(16, [10], NULL, NULL, {290, 888000}) = 1 (in [10], left {147, 820000})
> read(10, "P....\n", 6)                  = 6
> read(10, 0xb4515f0, 6)                  = -1 EAGAIN (Resource temporarily unavailable)
> time(NULL)                              = 1157559274
> select(16, [10], NULL, NULL, {300, 0}
> =======================================================================
> 
> The process just sits there in this loop of some sort, and never seems
> to do any actual spam processing.
> 
> Any ideas about this?

That looks bad :(  The strace snippet, however, is pretty normal-looking.
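
To illustrate why the strace output is unremarkable: a read on a
non-blocking descriptor that has just been drained returns EAGAIN, which
only means "no more data right now", and the process then goes back to
select. Here is a minimal Python sketch of the same select/read/read
sequence. The socketpair is a hypothetical stand-in for whatever is
behind fd 10 in the trace, and the function name is made up for
illustration; this is not spamd's own code.

```python
import errno
import select
import socket


def read_until_drained(payload=b"P....\n"):
    """Mimic the traced select/read/read sequence on a non-blocking socket."""
    # Hypothetical stand-in for the channel the child reads on (fd 10 in the trace).
    writer, reader = socket.socketpair()
    reader.setblocking(False)
    try:
        writer.sendall(payload)
        # select() reports the descriptor as readable, like the first traced call.
        ready, _, _ = select.select([reader], [], [], 5.0)
        assert reader in ready
        first = reader.recv(6)  # picks up the 6 queued bytes
        try:
            reader.recv(6)  # nothing is left to read
            got_eagain = False
        except BlockingIOError as exc:
            # EAGAIN/EWOULDBLOCK: not a failure, just an empty queue.
            got_eagain = exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK)
        return first, got_eagain
    finally:
        writer.close()
        reader.close()


if __name__ == "__main__":
    print(read_until_drained())
```

So the "Resource temporarily unavailable" line by itself does not point
to a missing resource; the child is idle and waiting for its next message.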

First off, are you using an up-to-date 3.1.x release?

Secondly, you need to strace both the child *and* the parent spamd process
-- the easiest way to do this is to "strace -f" the parent spamd, then
kill -15 the kids so that the parent starts new ones, which "strace -f"
will follow automatically.
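
While gathering the traces, it may also help to put a number on the
problem. Below is a rough Python sketch that counts how many children are
in state K on each "child states" log line and checks whether that count
ever goes down. The function names are hypothetical, and it assumes only
the log line format quoted above; it does not assume that a position in
the state string always refers to the same child.

```python
import re

# Matches the state string on lines like "prefork: child states: IIBBIK".
STATE_RE = re.compile(r"prefork: child states: ([A-Z]+)")


def killed_counts(lines):
    """Return the number of children in state K for each 'child states' line."""
    counts = []
    for line in lines:
        match = STATE_RE.search(line)
        if match:  # skip other lines, e.g. the --max-children warning
            counts.append(match.group(1).count("K"))
    return counts


def never_recovers(counts):
    """True if the K count never decreases from one snapshot to the next."""
    return all(later >= earlier for earlier, later in zip(counts, counts[1:]))


if __name__ == "__main__":
    sample = [
        "prefork: child states: IIBBIK",
        "prefork: child states: IIIBKK",
        "prefork: child states: IIKIKK",
        "prefork: server reached --max-children setting, consider raising it",
        "prefork: child states: IBKKKK",
    ]
    counts = killed_counts(sample)
    print(counts)                  # [1, 2, 3, 4]
    print(never_recovers(counts))  # True
```

In the log quoted above the K count only ever grows, from 1 to 6, which
fits the picture of killed children that never actually exit.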

--j.
