Hi, I have been using an older version of Postfix on a relay server for quite a few years now without any real incident. It accepts mail from one or two other servers and forwards it to an internal Exchange server on the same network. It handles about 250k messages per day and is configured with dual instances.
Over the last few months delivery times seem to have been getting steadily longer, and I can't explain why. I suspect something on the Exchange side, because nothing has changed on the Postfix server, but the administrators of the Exchange box haven't been able to offer any ideas either. I'm also pretty sure it's not a network issue: after passing billions of packets there isn't a single error. I'm also pretty sure DNS is configured properly.

The queue of the second instance holds a fairly constant 50 messages, and at times as many as 500. The 500 messages may sit there for half an hour, and then all of a sudden they are delivered. However, a constant 50 or so remain in the queue with status info like "conversation timed out while sending end of data -- message may be sent more than once" or "Error: timeout exceeded (in reply to end of DATA command)". Those messages may sit in the queue for as long as a few weeks, and I assume they are eventually delivered.

In my mail log, I see entries like the following:

Aug 20 01:08:12 bocmailrelay POSTFIX_F/smtp[1186]: C638B1A8008: to=<marie [email protected]>, relay=mail.example.com[xxx.yyy.zzz.3], delay=625109, status=deferred (conversation with mail.example.com[xxx.yyy.zzz.3] timed out while sending end of data -- message may be sent more than once)

I'm having difficulty telling apart messages that are just entering the second queue (typically with delay=0) from messages that are queued because they couldn't be delivered immediately. Is there an easier way to establish which messages are being queued because they couldn't be delivered?

I thought I would try "debug_peer_list" and increase logging to get information on delays for a specific domain, but I'm not sure that is what this variable is meant for. Is there another way to increase logging, either for a specific domain or for this problem in general, to troubleshoot it better?

Thanks,
Alex Hayes
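
P.S. To illustrate the kind of separation being asked about, here is a minimal Python sketch that pulls only the status=deferred delivery lines out of a mail log and counts them by reason. It assumes the log lines look like the entry quoted above (queue ID, to=<...>, delay=..., status=... followed by a reason in parentheses). The names DELIVERY_RE, deferred_entries and summarize are made up for this example and are not part of Postfix.

```python
import re
from collections import Counter

# Assumed layout, based on the log entry quoted in the post:
#   <queue id>: to=<rcpt>, ..., delay=<seconds>, ..., status=<status> (<reason>)
DELIVERY_RE = re.compile(
    r"(?P<queue_id>[0-9A-F]{6,}): to=<(?P<rcpt>[^>]*)>,.*?"
    r"delay=(?P<delay>[\d.]+),.*?"
    r"status=(?P<status>\w+) \((?P<reason>.*)\)\s*$"
)


def deferred_entries(lines):
    """Yield (queue_id, recipient, delay_seconds, reason) for deferred deliveries only."""
    for line in lines:
        match = DELIVERY_RE.search(line)
        if match and match.group("status") == "deferred":
            yield (
                match.group("queue_id"),
                match.group("rcpt"),
                float(match.group("delay")),
                match.group("reason"),
            )


def summarize(lines):
    """Count deferred deliveries by the reason text logged in parentheses."""
    reasons = Counter()
    for _queue_id, _rcpt, _delay, reason in deferred_entries(lines):
        reasons[reason] += 1
    return reasons


if __name__ == "__main__":
    import sys

    # Usage: python deferred.py < /path/to/maillog   (the path is whatever your syslog uses)
    for reason, count in summarize(sys.stdin).most_common():
        print(f"{count:6d}  {reason}")
```

Lines with any other status (for example status=sent) are skipped, so fresh arrivals that were delivered on the first attempt do not show up in the counts. Sorting the tuples from deferred_entries by delay would additionally show which messages have been waiting longest.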
