On Mon, Feb 02, 2009 at 05:26:10PM +0100, Gaute Amundsen wrote:

> On Monday 02 February 2009 15:43:19 Victor Duchovni wrote:
> > On Mon, Feb 02, 2009 at 01:50:30PM +0100, Gaute Amundsen wrote:
> > > Jan 25 05:59:19 hotell01 postfix/smtp[595]: fatal: watchdog timeout
> > > Jan 25 05:59:20 hotell01 postfix/master[734]: warning: process
> > > /usr/libexec/postfix/smtp pid 595 exit status 1
> > > Jan 25 05:59:20 hotell01 postfix/master[734]: warning:
> > > /usr/libexec/postfix/smtp: bad command startup -- throttling
> >
> > This happens when the smtp(8) process has been stuck waiting for something
> > to happen for 5 hours. What was happening around 00:59:xx on the same day?
> 
> Apparently nothing in particular:
>  
> http://pastebin.ca/1325397

    Jan 25 00:56:53 hotell01 postfix/qmgr[738]: B75CA147967:
        from=<aaaa...@hotell01.pht.no>, size=29074, nrcpt=1 (queue active)

The delivery agent scheduled to handle this message locked up for 5
hours, until its watchdog timer expired and it exited. It got stuck
before reporting "busy" to the master daemon, so no other smtp(8)
processes were allocated.
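
As a quick check of the timing, here is a small Python sketch that works
backwards from the "watchdog timeout" log line. It assumes the stock
daemon_timeout of 18000 seconds (5 hours) is in effect on this machine,
and it assumes the year 2009, since syslog does not record one. The
helper name syslog_time is made up for this illustration.

```python
from datetime import datetime, timedelta

# Assumption: the Postfix default daemon_timeout (18000s = 5 hours) is unchanged.
DAEMON_TIMEOUT = timedelta(seconds=18000)


def syslog_time(line, year=2009):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a syslog line.

    Syslog omits the year, so it has to be supplied by the caller.
    """
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


active = syslog_time("Jan 25 00:56:53 hotell01 postfix/qmgr[738]: B75CA147967:")
fatal = syslog_time("Jan 25 05:59:19 hotell01 postfix/smtp[595]: fatal: watchdog timeout")

# When the watchdog was last reset, if it ran for the full default timeout.
stuck_since = fatal - DAEMON_TIMEOUT
print(stuck_since.strftime("%H:%M:%S"))   # 00:59:19

# Time between the message becoming active and the fatal exit.
elapsed = fatal - active
print(elapsed)                            # 5:02:26
```

The first result is where the "00:59:xx" above comes from; the second
shows the message went active a couple of minutes before that.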

> Our Munin http://munin.projects.linpro.no/
> has lost the fine details that far back, but there is a regular high peak on
> iostat just before 01:00 every night. Backup related, I guess.
>
> Both today and Jan 25 were Mondays, so I had a look at cron.weekly which runs

Perhaps your system runs out of resources during backup, and perhaps when
this happens the system behaves in ways it should not.

I am guessing a "ready" indication arrived for the private/smtp socket,
but accept() blocked indefinitely. This would then be a kernel issue.
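
For readers unfamiliar with the pattern: a server normally waits for a
"ready" indication on its listening socket and only then calls accept().
The Python sketch below is a generic illustration of that sequence, not
Postfix code, and it says nothing about how Postfix configures its
sockets. The socket path is a throwaway stand-in for private/smtp. It
only shows what accept() does on a non-blocking listener when no
connection is pending: it returns an error immediately instead of
waiting.

```python
import os
import select
import socket
import tempfile

# Hypothetical stand-in for a UNIX-domain service socket such as private/smtp.
path = os.path.join(tempfile.mkdtemp(), "demo.sock")

listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
listener.bind(path)
listener.listen(1)
listener.setblocking(False)   # accept() must never wait on this socket

# Poll for readiness without waiting; nobody has connected, so nothing is ready.
readable, _, _ = select.select([listener], [], [], 0)

try:
    conn, _ = listener.accept()
    result = "accepted"
    conn.close()
except BlockingIOError:
    # No pending connection: a non-blocking accept() fails at once.
    result = "would block"

print(readable, result)

listener.close()
os.unlink(path)
```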

If this happens again, you need to catch the stuck smtp(8) *before* the
watchdog timer expires, and get a core file via "gcore". Then report a
stack trace of the process.
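
A rough Python sketch of how one might spot a candidate and prepare the
commands follows. Everything in it is an assumption to adapt: the
600-second age threshold, the core file prefix, the helper names, and a
Linux procps "ps" that supports the "etimes" column. Process age is
only a heuristic, since a healthy smtp(8) can also run for a while, so
check the logs before dumping anything. The smtp binary path is the one
shown in the log above.

```python
MIN_AGE = 600  # seconds; hypothetical threshold, well below the 5-hour watchdog


def long_running(ps_output, name="smtp", min_age=MIN_AGE):
    """Return PIDs of processes named `name` that are at least min_age old.

    Expects the text printed by `ps -eo pid=,etimes=,comm=`
    (assumed to be procps on Linux).
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        pid, etimes, comm = fields
        if comm.strip() == name and int(etimes) >= min_age:
            pids.append(int(pid))
    return pids


def dump_commands(pid, prefix="/var/tmp/smtp-core",
                  binary="/usr/libexec/postfix/smtp"):
    """Commands to run as root: write a core file, then print a backtrace."""
    core = f"{prefix}.{pid}"  # gcore appends the PID to the -o prefix
    return [
        ["gcore", "-o", prefix, str(pid)],
        ["gdb", "-batch", "-ex", "bt", binary, core],
    ]


# Made-up sample of ps output, for illustration only.
sample = "  595 17000 smtp\n  734 90000 master\n  801    12 smtp\n"
for pid in long_running(sample):
    for cmd in dump_commands(pid):
        print(" ".join(cmd))
```

The sketch only prints the commands; run them by hand once you are sure
the process really is the stuck one.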

-- 
        Viktor.

Disclaimer: off-list followups get on-list replies or get ignored.
Please do not ignore the "Reply-To" header.

If my response solves your problem, the best way to thank me is to not
send an "it worked, thanks" follow-up. If you must respond, please put
"It worked, thanks" in the "Subject" so I can delete these quickly.