On Tue, Oct 21, 2008 at 07:07:02PM -0400, Ofer Inbar wrote:

> I have noticed occasional qmgr crashes with the "watchdog timer" error
> occurring, usually when it's in the middle of deferring thousands of
> messages for one domain all at once.  I meant to investigate those.
>
> However, based on the logs, that's not what it was doing at the time
> this particular freeze happened.

Your observations are almost certainly in error. Wietse's analysis is
correct, and you should upgrade to 2.4 or later.

> I'll watch for future occurrences to collect more data,
> and try to get an upgrade soon.

You can skip waiting for future occurrences: the behaviour you describe
(especially on fallback relays, where dead destinations are to be
expected) fits the known issue like a glove (and we are not at the OJ
trial :-).

You may also consider tuning the feedback controls on the fallback relay
so that problematic destinations are throttled less aggressively. This is
appropriate when most of the deliveries fail but the site is not dead,
that is, when more than 0% but less than 50% of the deliveries succeed.
The new feedback controls in 2.5.5 allow you to tune Postfix to be less
pessimistic when sending bulk mail to highly problematic destinations.

-- 
	Viktor.
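
P.S. A minimal main.cf sketch of the feedback tuning mentioned above, for
a fallback relay running Postfix 2.5 or later. The parameter names are the
ones documented in postconf(5); the values are illustrative assumptions,
not tested recommendations, and should be adjusted to the traffic
actually seen on the relay.

```
# /etc/postfix/main.cf on the fallback relay (values are assumptions)

# Tolerate more consecutive failed pseudo-cohorts before a destination
# is marked dead (the default is 1).
default_destination_concurrency_failed_cohort_limit = 10

# Shrink the per-destination concurrency window more gently after each
# failed delivery (the default is 1).
default_destination_concurrency_negative_feedback = 1/8
```

After editing main.cf, run "postfix reload" and check the result with
"postconf -n". The same settings can be scoped to a single transport by
replacing the "default_" prefix with the transport name (for example
"smtp_"), which leaves other transports at their defaults.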