I’m banging my head on the desk over this one. Some hosts fly right through: connect, EHLO, MAIL FROM, RCPT TO, DATA, QUIT, done, all in maybe one second. Others consistently time out after EHLO. I’ve telnetted into the box from off-site (a regular host, not a mail server): it connects immediately, responds immediately with the 220 mail.zzzz.com ESMTP Postfix banner, and responds immediately to HELO or EHLO. However, it then takes about two minutes to respond to MAIL FROM. Once it eventually responds with 250 2.1.0 Ok, everything else goes as expected. Is there some other check or config value I’m missing? I’m losing a ton of legitimate email because some servers simply refuse to wait the two minutes, and the stalled sessions are artificially inflating my smtpd connection counts as well, so I’m eager to figure this one out as quickly as possible.
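For anyone who wants to reproduce the timing without typing into telnet, here is a rough sketch of the same test using Python's standard smtplib. It only measures how long each step takes to get a reply; it does not diagnose anything. The function name, the sender address and the EHLO name are placeholders I made up for illustration, so substitute real values.

```python
import smtplib
import time


def time_smtp_steps(host, port=25, sender="probe@example.com",
                    helo_name="probe.example.com", timeout=300):
    """Return a list of (step, seconds, reply_code) for each SMTP step.

    sender and helo_name are placeholder values, not anything from the
    real setup. timeout is generous so a two-minute stall is measured
    instead of aborted.
    """
    timings = []

    def timed(step, func, *args):
        # Run one SMTP step and record how long the server took to reply.
        start = time.monotonic()
        code, _ = func(*args)
        timings.append((step, time.monotonic() - start, code))

    # No host is given here, so the object is created without connecting.
    smtp = smtplib.SMTP(local_hostname=helo_name, timeout=timeout)
    try:
        timed("connect", smtp.connect, host, port)   # waits for the 220 banner
        timed("ehlo", smtp.ehlo, helo_name)
        timed("mail from", smtp.docmd, "MAIL", f"FROM:<{sender}>")
        timed("quit", smtp.quit)
    finally:
        smtp.close()
    return timings
```

Calling `time_smtp_steps("mail.zzzz.com")` from the off-site host and printing the result should show one row per step; with the behaviour described above, the "mail from" row would be the one taking roughly two minutes. The session stops after MAIL FROM, so no message is actually submitted.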
The box is somewhat active but certainly not loaded (load average 0.30 to 0.70, and roughly 50 established connections to smtpd, most of them waiting for the response to MAIL FROM). I’ve turned off all RBLs, the protocol checks (reject_invalid_helo_hostname, reject_non_fqdn_helo_hostname, reject_non_fqdn_sender), and the DNS checks. tcpdump shows nothing out of the ordinary when run from either end of the wire. Any help or ideas would be GREATLY appreciated!