In the 25 years I've been dealing with email, the one constant is "what
happened to my email?". Not that I'm a guru --- I've always been in
small groups doing a variety of things from admin to networking to
coding, so often it's a matter of something that hasn't come up in a
while and having to refresh my memory...

One of the reasons I like postfix is that it's relatively easy to answer
that question...but not perfect. I haven't had a chance to try the
latest 2.8 yet --- we're using Zimbra at the moment which is using
2.6.5, but I just had a case where the message bounced with this
singularly unhelpful message:

> If you do so, please include this problem report. You can delete your own
> text from the attached returned message.
> 
>                    The mail system
> 
> <someotheru...@someotherdomain.com>: someotherdomain.com 

with the relevant "details":

> Final-Recipient: rfc822; someotheru...@someotherdomain.com
> Original-Recipient: rfc822;someotheru...@someotherdomain.com
> Action: failed
> Status: 5.0.0
> Diagnostic-Code: X-Postfix; someotherdomain.com

The corresponding log entries were no more helpful:

> Feb  1 13:26:47 mta01 postfix/smtpd[25137]: warning: 1.1.1.1: address not 
> listed for hostname somehost.domain.com
> Feb  1 13:26:47 mta01 postfix/smtpd[25137]: connect from unknown[1.1.1.1]
> Feb  1 13:26:51 mta01 postfix/smtpd[25137]: 3BBBC49137C: 
> client=unknown[1.1.1.1], sasl_method=LOGIN, sasl_username=someu...@domain.com
> Feb  1 13:26:51 mta01 postfix/cleanup[24210]: 3BBBC49137C: 
> message-id=<00de01cbc256$bcbcd8e0$36368aa0$@com>
> Feb  1 13:26:51 mta01 postfix/qmgr[20498]: 3BBBC49137C: 
> from=<someu...@domain.com>, size=3089, nrcpt=1 (queue active)
> Feb  1 13:26:51 mta01 postfix/smtpd[25137]: disconnect from unknown[1.1.1.1]
> Feb  1 13:26:51 mta01 postfix/error[25284]: 3BBBC49137C: 
> to=<someotheru...@anotherdomain.com>, relay=none, delay=0.05, 
> delays=0.03/0.01/0/0.01, dsn=5.0.0, status=bounced (someotherdomain.com)
> Feb  1 13:26:51 mta01 postfix/bounce[25286]: 3BBBC49137C: sender non-delivery 
> notification: 46E7C49183B
> Feb  1 13:26:51 mta01 postfix/qmgr[20498]: 3BBBC49137C: removed

It turns out the problem was that "someotherdomain.com" was configured
locally when it shouldn't have been.  I *think* that's what the domain
name in the Diagnostic-Code in the bounce and the
"(someotherdomain.com)" after "status=bounced" was trying to say, but
with some slight changes to the log messages, troubleshooting would be
vastly improved:

First, *every* log entry should have an ID attached (if it's not related
to a connection, it should have something like "INTERNAL" or ideally
something related to the activity that caused the log entry that, if
appropriate, allows related log entries to be searched for easily).  A
connection id should be associated with the first connection from a
client.  Each message should get a unique suffix, but everything related
to that connection should be tied together.  That ID should show in the
bounce as well.  That way, I can "grep ID maillog" and get everything
related to that message quickly and easily.  Note that the suffix should
be easily parsable from the connection id so a bounce could be processed
automatically if desired; it even makes it easier for the human to do it.

Second, all errors should be clear; here's what the above *should* have
looked like, in my ideal world:

> Final-Recipient: rfc822; someotheru...@someotherdomain.com
> Original-Recipient: rfc822;someotheru...@someotherdomain.com
> Action: failed -
> Status: 5.0.0
> Diagnostic-Code: X-Postfix; user unknown in local domain
someotherdomain.com
> Message-ID: 3BBBC49137C-001

"grep 3BBBC49137C maillog":

> Feb  1 13:26:47 mta01 postfix/smtpd[25137]: 3BBBC49137C: connect from 
> unknown[1.1.1.1]
> Feb  1 13:26:47 mta01 postfix/smtpd[25137]: 3BBBC49137C: warning: 1.1.1.1: 
> address not listed for hostname somehost.domain.com
> Feb  1 13:26:47 mta01 postfix/smtpd[25137]: 3BBBC49137C: 
> client=unknown[1.1.1.1], auth_method=LOGIN, auth_username=someu...@domain.com
> Feb  1 13:26:51 mta01 postfix/cleanup[24210]: 3BBBC49137C-001: 
> message-id=<00de01cbc256$bcbcd8e0$36368aa0$@com>
> Feb  1 13:26:51 mta01 postfix/qmgr[20498]: 3BBBC49137C-001: 
> from=<someu...@domain.com>, size=3089, nrcpt=1 (entered into active queue)
> Feb  1 13:26:51 mta01 postfix/smtpd[25137]: 3BBBC49137C: disconnect from 
> unknown[1.1.1.1]
> Feb  1 13:26:51 mta01 postfix/error[25284]: 3BBBC49137C-001: 
> to=<someotheru...@anotherdomain.com>, relay=none, delay=0.05, 
> delays=0.03/0.01/0/0.01, dsn=5.0.0, status=bounced (someotherdomain.com), 
> reason=user unknown in local domain someotherdomain.com
> Feb  1 13:26:51 mta01 postfix/bounce[25286]: 3BBBC49137C-001: sender 
> non-delivery notification: 46E7C49183B-001
> Feb  1 13:26:51 mta01 postfix/qmgr[20498]: 3BBBC49137C-001: removed

Like I said, postfix usually does pretty well --- in the case where a
remote host rejected the message, it's perfect:

> status=deferred (host xyzzy.com[3.3.3.3] said: 452 4.2.2 Over quota (in reply 
> to end of DATA command))

I think these changes are actually pretty small ones, but would be
pretty helpful; that's my 2c, fwiw...

Attachment: smime.p7s
Description: S/MIME Cryptographic Signature

Reply via email to