On Wed, Oct 22, 2008 at 12:06:40AM -0400, Ofer Inbar wrote:

> Victor Duchovni <[EMAIL PROTECTED]> wrote:
> > You can skip waiting for future occurrences, the behaviour you describe
> > (especially on fallback relays where dead destinations are to be expected)
> > fits the known issue like a glove (and we are not at the OJ trial :-).
> 
> Regardless, I definitely sometimes get qmgr dying due to a watchdog
> timeout when it's deferring many thousands of messages to the same
> destination, without the deadlock.

Yes, the deadlock is infrequent and timing-dependent: the watchdog has
to fire right when qmgr(8) is already performing I/O inside syslog(3).
Most of the time it fires when qmgr(8) is doing something else.

> As a temporary workaround, I tried doubling daemon_timeout.
> 
> However, I'm puzzled - it defaults to 18000s but the watchdog timer
> seems to kill qmgr during these incidents after about a half hour,
> which is 1800 seconds.

Wrong timer. The watchdog timeout is hard-coded to 1000s.

> > You may also consider tuning the feedback controls on the fallback relay,
> > so that problematic destinations are throttled less aggressively. This
> > is appropriate when most of the deliveries fail, but the site is not
> > dead and more than 0%, but less than 50%, of the deliveries succeed.
> 
> Thank you.  And yes, it's definitely the case with the domains that
> are involved, that some deliveries succeed, but fewer than 50% (at the
> times when this problem shows up).
> 
> It is not possible to tune the feedback controls on a version earlier
> than 2.5, correct?

Correct, it is not possible. This is a major queue manager design change
in 2.5, and is not available in earlier code. The watchdog issue is
resolved in 2.4.
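
As a rough sketch of what that tuning could look like on 2.5 or later,
assuming the fallback relay delivers through the stock "smtp" transport:
the three per-transport parameters below are the 2.5 feedback controls,
but the values are illustrative assumptions only, not tested
recommendations for your traffic.

```
# main.cf on the fallback relay (Postfix 2.5 or later).
# All values below are illustrative assumptions.

# How many failed pseudo-cohorts are tolerated before the destination
# is considered dead and deliveries to it are suspended (default: 1).
smtp_destination_concurrency_failed_cohort_limit = 10

# How much the per-destination concurrency drops after each failure
# (default: 1). A fraction makes the back-off gentler.
smtp_destination_concurrency_negative_feedback = 1/8

# How much the per-destination concurrency grows after each success
# (default: 1).
smtp_destination_concurrency_positive_feedback = 1/4
```

The idea is only that a higher cohort limit and smaller negative
feedback keep a mostly-failing but still live destination from being
throttled after a single bad connection; suitable numbers depend on
the destinations involved.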

Note that "succeed" here means the opposite of "failure", where
"failure" is not a failure to deliver, but rather a failure to connect,
an active rejection at connect or HELO, or an I/O timeout during the mail
transaction. Deliveries that fail with 4XX in response to "MAIL", "RCPT",
"DATA" or "." don't cause negative feedback...

-- 
        Viktor.

Disclaimer: off-list followups get on-list replies or get ignored.
Please do not ignore the "Reply-To" header.

If my response solves your problem, the best way to thank me is to not
send an "it worked, thanks" follow-up. If you must respond, please put
"It worked, thanks" in the "Subject" so I can delete these quickly.