Hi,

I have been using an older version of postfix on a relay server for
quite a few years now, without any real incident. It accepts mail from
one or two other servers and forwards it on to an internal Exchange
server on the same network. It handles about 250k messages per day.
It's configured with dual instances.

Over the last few months, delivery delays seem to have been
increasing, and I can't explain why. I suspect something on the
Exchange side because nothing has changed on the postfix server. The
administrators of the Exchange box aren't able to provide any ideas
either. I'm also pretty sure it's not a network issue. After passing
billions of packets there isn't a single error. I'm also pretty sure
DNS is configured properly.

I'm seeing occasions where there will be a constant 50 messages in the
second instance, and as many as 500 at times. The 500 messages may sit
there for a half-hour, and then all of a sudden they are delivered.
However, there remains a constant 50 in the queue with status info
like "conversation timed out while sending end of data -- message may
be sent more than once" or "Error: timeout exceeded (in reply to end
of DATA command)".

The messages may sit in the queue for up to a few weeks, and I assume
they are eventually delivered.

In my mail log, I see info like the following:

Aug 20 01:08:12 bocmailrelay POSTFIX_F/smtp[1186]: C638B1A8008:
  to=<marie[email protected]>, relay=mail.example.com[xxx.yyy.zzz.3],
  delay=625109, status=deferred (conversation with
  mail.example.com[xxx.yyy.zzz.3] timed out while sending end of data --
  message may be sent more than once)

I'm having difficulty distinguishing messages that are just entering
the second queue (typically with delay=0) from messages that are
queued because they couldn't be delivered immediately. Is there an
easier way to establish which messages are being queued because of a
delivery problem?
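
One rough way to separate the two cases from the log alone would be a
small filter like the sketch below. It is only an illustration: it
assumes every delivery line has the same shape as the sample above
(queue ID, to=<...>, delay=, status=), and the function names and the
300-second threshold are hypothetical choices, not anything Postfix
provides.

```python
import re

# Pulls queue ID, recipient, delay and status out of a Postfix smtp
# delivery log line. Assumption: the line looks like the sample above.
LINE_RE = re.compile(
    r"(?P<queue_id>[0-9A-F]{6,}): to=<(?P<rcpt>[^>]*)>,"
    r".*?\bdelay=(?P<delay>\d+(?:\.\d+)?),"
    r".*?\bstatus=(?P<status>\w+)"
)


def parse_delivery(line):
    """Return (queue_id, recipient, delay_seconds, status), or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return m["queue_id"], m["rcpt"], float(m["delay"]), m["status"]


def stuck_deliveries(lines, min_delay=300.0):
    """Yield deferred deliveries whose delay is at least min_delay seconds.

    min_delay is an arbitrary illustrative threshold.
    """
    for line in lines:
        parsed = parse_delivery(line)
        if parsed and parsed[3] == "deferred" and parsed[2] >= min_delay:
            yield parsed
```

Feeding the lines of the mail log through stuck_deliveries() would
list only the deferred attempts with a large delay, leaving out the
delay=0 entries for mail that has just arrived.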

I thought I would try "debug_peer_list" and increase logging to try
and get information on delays from a specific domain, but I'm not sure
that is what this variable is used for. Is there another way to
increase logging either for a specific domain or for this problem to
better troubleshoot it?

Thanks,
Alex Hayes
