On Mon, Feb 02, 2009 at 05:26:10PM +0100, Gaute Amundsen wrote:

> On Monday 02 February 2009 15:43:19 Victor Duchovni wrote:
> > On Mon, Feb 02, 2009 at 01:50:30PM +0100, Gaute Amundsen wrote:
> > > Jan 25 05:59:19 hotell01 postfix/smtp[595]: fatal: watchdog timeout
> > > Jan 25 05:59:20 hotell01 postfix/master[734]: warning: process
> > > /usr/libexec/postfix/smtp pid 595 exit status 1
> > > Jan 25 05:59:20 hotell01 postfix/master[734]: warning:
> > > /usr/libexec/postfix/smtp: bad command startup -- throttling
> >
> > This happens when the smtp(8) process has been stuck waiting for something
> > to happen for 5 hours. What was happening around 00:59:xx on the same day?
>
> Apparently nothing in particular:
>
> http://pastebin.ca/1325397

    Jan 25 00:56:53 hotell01 postfix/qmgr[738]: B75CA147967:
        from=<aaaa...@hotell01.pht.no>, size=29074, nrcpt=1 (queue active)

The delivery agent scheduled to handle this message locked up for 5 hours
and gave up. It got stuck before reporting "busy" to the master daemon,
so no other smtp(8) processes were allocated.

> our Munin http://munin.projects.linpro.no/
> has lost the fine details that far back but there is a regular high peak on
> IOstsat just before 01:00 every night. Backup related I guess.
>
> both today and Jan 25 was a monday, so I had a look at cron.weekly which runs

Perhaps your system runs out of resources during backup, and perhaps
when this happens the system behaves in ways it should not. I am guessing
a "ready" indication arrived for the private/smtp socket, but accept()
blocked indefinitely. This would then be a kernel issue.

If this happens again, you need to catch the stuck smtp(8) *before* the
watchdog timer expires, and get a core file via "gcore". Then report a
stack trace of the process.

-- 
	Viktor.
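
The "5 hours" above can be turned into a time to look for in the logs. The following is only a rough sketch: it assumes daemon_timeout is at its default of 18000 seconds, and that the syslog year is 2009 (syslog timestamps do not carry one). The function name stuck_since is made up for illustration. It subtracts the watchdog interval from the "fatal: watchdog timeout" timestamp to estimate when the smtp(8) process last made progress.

```python
from datetime import datetime, timedelta

# Assumption: daemon_timeout is at its default of 18000 seconds (5 hours).
DAEMON_TIMEOUT = timedelta(seconds=18000)


def stuck_since(log_line: str, year: int = 2009) -> datetime:
    """Estimate when the process last reset its watchdog timer.

    Hypothetical helper: takes a syslog line such as the
    "fatal: watchdog timeout" record and subtracts the watchdog interval.
    """
    # Syslog timestamps look like "Jan 25 05:59:19" and carry no year,
    # so the year is supplied by the caller.
    stamp = " ".join(log_line.split()[:3])
    fatal_at = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    return fatal_at - DAEMON_TIMEOUT


line = "Jan 25 05:59:19 hotell01 postfix/smtp[595]: fatal: watchdog timeout"
print(stuck_since(line))  # 2009-01-25 00:59:19
```

For the log line quoted in this thread the estimate is 00:59:19, a few minutes after the 00:56:53 "queue active" record. A stuck process would have to be caught, and its core taken with gcore, somewhere inside that 5-hour window.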