Victor Duchovni <[EMAIL PROTECTED]> wrote:
> You can skip waiting for future occurrences, the behaviour you describe
> (especially on fallback relays where dead destinations are to be expected)
> fits the known issue like a glove (and we are not at the OJ trial :-).

Regardless, I do sometimes see qmgr killed by a watchdog timeout while
it is deferring many thousands of messages to the same destination,
without the deadlock. As a temporary workaround, I tried doubling
daemon_timeout. However, I'm puzzled: it defaults to 18000s, but during
these incidents the watchdog seems to kill qmgr after about half an
hour, which is 1800 seconds. Is the value of daemon_timeout actually
expressed in tenths of a second? Or is daemon_timeout not the timer
that controls how long the watchdog gives qmgr in these cases?

> You may also consider tuning the feedback controls on the fallback relay,
> so that problematic destinations are throttled less aggressively, this
> is appropriate when most of the deliveries fail, but the site is not
> dead and more than 0%, but less than 50%, of the deliveries succeed.

Thank you. And yes, that is definitely the case for the domains
involved: some deliveries succeed, but fewer than 50% (at the times
when this problem shows up). Is it correct that the feedback controls
cannot be tuned on a version earlier than 2.5?

-- Cos