Hello,

  Well, after a late night and having joined the ranks of those
running dbmail in production with no turning back now, we're
seeing dbmail-pop3d get into a state of using all available cpu
cycles, as others have mentioned.  We have one machine on which
this is 100% reproducible by simply telnetting to port 110 and
waiting 5 minutes for the timeout - the dbmail-pop3d that handled
our connection will then be in that state.  Sending a HUP signal to
the parent dbmail-pop3d will end up clearing it (at the cost of
dropping all current pop3 sessions), or sending SIGKILL to that
process will kill it too.  It's easy to identify by turning up logging
and watching for 'got signal [14]' .. we're working on writing a
script to monitor that and KILL those processes for the short term.
The problem is somewhere in the signal handling, but I've not
found it yet.  If anyone wants to look into it, or try reproducing
it, that'd be great.  I'll keep working on it too, but right now
lack of sleep is starting to impair progress.  This is with cvs
as of this morning, with pgsql; we have a few custom patches too,
but they only take effect later on (after you log in); this is just
in the signal handling routines.
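
For anyone who wants to try reproducing it without sitting at a telnet
prompt, here is a minimal sketch of the same steps: connect to the pop3
port, read the greeting, send nothing, and wait for the server to drop
the connection.  The function name and the 400 second default (the 5
minute timeout plus some margin) are my own assumptions for
illustration, not anything from dbmail itself.

```python
import socket
import time


def idle_until_dropped(host, port=110, max_wait=400.0):
    """Connect, read the POP3 greeting, then send nothing at all.

    Returns (greeting, seconds_idle) once the server closes the
    connection.  Raises TimeoutError if it is still open after
    max_wait seconds (assumed default: 5 minute timeout plus margin).
    """
    with socket.create_connection((host, port), timeout=30) as sock:
        greeting = sock.recv(1024).decode("ascii", "replace").strip()
        start = time.monotonic()
        deadline = start + max_wait
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                raise TimeoutError("server did not drop the idle connection")
            sock.settimeout(remaining)
            try:
                data = sock.recv(1024)
            except socket.timeout:
                raise TimeoutError(
                    "server did not drop the idle connection") from None
            except ConnectionResetError:
                # A reset also means the server side went away.
                return greeting, time.monotonic() - start
            if not data:
                # An empty read means the server closed the socket.
                return greeting, time.monotonic() - start
```

Call it with the hostname of the test machine and, once it returns,
check top or ps to see whether the dbmail-pop3d child that served the
connection is now spinning.  This only triggers the condition; it does
not detect or fix it.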

Thanks,
Jesse



--
Jesse Norell
jesse (at) kci.net

