One of our very important clients (a major bank) is having ongoing
problems with denial-of-service style dictionary spam attacks.  Their
anti-spam/firewall teams are slow to respond to these outbreaks, so
there may be periods of several hours where we get frequent
"connection refused" errors because their resources are overloaded.

As you know, the "cool-off" period in Postfix extends the retry delay of
messages in the deferred queue from an initial value of
$minimal_backoff_time to a maximum of $maximal_backoff_time.  So with
the default configuration, my Postfix server would try to deliver the
message at 0/300/600/1200/2400 seconds, and eventually only every 4000
seconds (~66 min).
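
To make that arithmetic concrete, here is a rough Python sketch of the schedule as I understand it.  It is a simplified model, not Postfix code: it assumes each retry delay equals the message's current age, clamped between minimal_backoff_time and maximal_backoff_time, and it ignores the granularity added by queue_run_delay.  The function name retry_ages is made up for illustration.

```python
def retry_ages(minimal_backoff=300, maximal_backoff=4000, attempts=8):
    """Return the message age in seconds at each delivery attempt.

    Simplified model (an assumption, not taken from Postfix source):
    after a failed attempt, the next delay equals the message's current
    age, clamped to [minimal_backoff, maximal_backoff].  Queue scan
    granularity (queue_run_delay) is ignored.
    """
    ages = [0]  # first attempt happens when the message arrives
    while len(ages) < attempts:
        age = ages[-1]
        delay = min(max(age, minimal_backoff), maximal_backoff)
        ages.append(age + delay)
    return ages


if __name__ == "__main__":
    # Defaults assumed here: minimal_backoff_time = 300s,
    # maximal_backoff_time = 4000s.
    print(retry_ages())
```

Under this model, once the message is older than 4000 seconds, the gap between attempts stays pinned at 4000 seconds, which is the roughly hourly retry described above.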

The result is that some really unlucky messages end up being delayed by
an hour or more.  In the meantime, the client receives messages from
other companies with only minor delays, so the client and my boss both
blame our system.  No matter how many times I explain it, they don't
understand why the emails of others can get through, but mine have such
big delays.

For this one domain, I'd like to configure postfix to retry more
frequently.  My theory is that if we retry more frequently, we're more
likely to be lucky enough to get a successful connection.  In any case,
my hope is that we'd at least be as successful as anyone else in getting
messages through their overloaded system.

I read the section of the documentation on high volume destination
backlog (http://www.postfix.org/QSHAPE_README.html#backlog)
and thought it was the answer to my problems.  I followed the directions:
I added the fragile transport to master.cf, added the domain to the
transport file (and ran postmap on it), and added the following to my
main.cf file:

fragile_destination_concurrency_failed_cohort_limit = 360
fragile_destination_concurrency_limit = 20

(BTW, I'm running postfix-2.6.20080216p1.)

I also set queue_run_delay = 60s and minimal_backoff_time = 60s.  So
in my testing, I was expecting to see messages sent to this domain retry
every minute, 360 times.  At that point, I expected the retry interval
to "cool off" to 4000 seconds.  Unfortunately, in my testing it didn't
work that way, so either I'm misunderstanding how it's supposed to work,
or I just have it misconfigured.

Am I understanding these settings properly or should I be trying
something else?

I've toyed around with lowering maximal_backoff_time globally to
300s, but it bugs me to see such frequent retries on messages sent to
mistyped domains like yaho.com and hotmial.com.

Thanks!

Scott
