One of our very important clients (a major bank) is having ongoing problems with denial-of-service-style dictionary spam attacks. Their anti-spam/firewall teams are slow to respond to these outbreaks, so there may be periods of several hours during which we get frequent "connection refused" errors because their resources are overloaded.
As you know, the "cool-off" period in Postfix extends the retry delay of messages in the deferred queue from an initial $minimal_backoff_time up to a maximum of $maximal_backoff_time. So with the default configuration, my Postfix server would try to deliver the message at 0/300/600/1200/2400 seconds, and eventually only every 4000 seconds (~66 min). The result is that some really unlucky messages end up delayed by an hour or more.

In the meantime, the client receives messages from other companies with only minor delays, so the client and my boss both blame our system. No matter how many times I explain it, they don't understand why other senders' email gets through while ours is delayed so badly.

For this one domain, I'd like to configure Postfix to retry more frequently. My theory is that if we retry more frequently, we're more likely to be lucky enough to get a successful connection. In any case, my hope is that we'd at least be as successful as anyone else in getting messages through their overloaded system.

I examined the "High volume destination backlog" section of the documentation (http://www.postfix.org/QSHAPE_README.html#backlog) and thought it was the answer to my problems. I followed the directions: I added the fragile transport to master.cf, added the domain to the transports file (postmap'd), and added

    fragile_destination_concurrency_failed_cohort_limit = 360
    fragile_destination_concurrency_limit = 20

to my main.cf file. (BTW, I'm running postfix-2.6.20080216p1.) I also set queue_run_delay = 60s and minimal_backoff_time = 60s.

So in my testing, I was expecting to see messages sent to this domain retried every minute, 360 times. At that point, I expected the retry interval to "cool off" to 4000 seconds. Unfortunately, in my testing it didn't work, so either I'm misunderstanding how it's supposed to work, or I have it misconfigured. Am I understanding these settings properly, or should I be trying something else?
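To make the retry schedule I'm describing concrete, here is a rough sketch in Python. It is only a simplified model, not Postfix code. It assumes that after each failed attempt the next delay equals the message's age, clamped between minimal_backoff_time and maximal_backoff_time, and it ignores the granularity that queue_run_delay adds. The function name retry_schedule is made up for illustration.

```python
def retry_schedule(min_backoff=300, max_backoff=4000, attempts=8):
    """Return delivery attempt times, in seconds after the message arrived.

    Simplified model (an assumption, not taken from Postfix itself):
    the delay before the next attempt equals the message's current age,
    clamped to the range [min_backoff, max_backoff]. The deferred-queue
    scan interval (queue_run_delay) is ignored.
    """
    times = [0]  # the first attempt happens on arrival
    while len(times) < attempts:
        age = times[-1]
        delay = min(max(age, min_backoff), max_backoff)
        times.append(age + delay)
    return times


if __name__ == "__main__":
    # Default backoff settings: 300s minimum, 4000s maximum.
    print(retry_schedule())
    # The global change I experimented with: 60s minimum, 300s maximum.
    print(retry_schedule(min_backoff=60, max_backoff=300))
```

Under this model the defaults give attempts at 0/300/600/1200/2400 seconds and then intervals that settle at 4000 seconds, which matches what I see in practice.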
I've toyed around with adjusting maximal_backoff_time globally to 300s, but it bugs me to see retries on messages sent to yaho.com and hotmial.com.

Thanks!
Scott