On Tue, Jun 26, 2018 at 02:34:19PM -0400, James B. Byrne wrote:
> When we do we frequently (always?) get messages like this in
> the mail queue:

Well, I may not have solved the problem, but I have identified what it
is and provided a work-around for now.

The issue is this:

smtp_tls_security_level=dane

Last week Viktor Dukhovni reported to me that our domain had a DNSSEC
problem.  I investigated and discovered that on our master DNS host
the named daemon was reporting errors similar to the following:

Jun  9 00:37:40 dns03 named[3729]: malformed transaction: . . .

These errors did not have any obvious cause, and inquiries on the OS
users list did not elicit any diagnosis.  So, for want of any clear
remediation, I simply restarted the named service and the problem with
DNSSEC went away.

However, the problem returned and, in combination with the dane
security level setting in Postfix, apparently caused Postfix to
report:

(delivery temporarily suspended: Server certificate not verified)

Switching smtp_tls_security_level from 'dane' to 'may' allowed the
mail to be delivered without further problem.
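
For anyone wanting to repeat the workaround, here is a minimal sketch
of one way to switch the level and retry the queue.  The function name
set_smtp_tls_level is hypothetical and not part of Postfix; it only
wraps the standard postconf, postfix and postqueue commands, and it
assumes it is run as root on the mail host.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): set the Postfix
# SMTP client TLS security level, reload, and flush the deferred queue.
set_smtp_tls_level() {
    level=$1
    # Accept only values documented for smtp_tls_security_level.
    case "$level" in
        none|may|encrypt|dane|dane-only|fingerprint|verify|secure) ;;
        *)
            echo "unknown smtp_tls_security_level: $level" >&2
            return 2
            ;;
    esac
    # Write the setting into main.cf, then make Postfix pick it up.
    postconf -e "smtp_tls_security_level = $level" || return 1
    postfix reload || return 1
    # Ask the queue manager to retry deferred mail now.
    postqueue -f
}
```

Calling "set_smtp_tls_level may" applies the workaround; calling
"set_smtp_tls_level dane" puts the original setting back later.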

[root@mx32 ~]# mailq
Mail queue is empty

This still does not clarify for me why the double-bounce address was
being reported, given that the values postconf reported for notify did
not include bounces.

Now, the problem with DANE and our DNS master service turns out to be
rather odd.  Somehow we ended up with two named processes running
simultaneously on that host.  How one can have two processes bind to
the same port escapes me, but evidently it happened.  Or it did not
happen, and the June 8 process somehow failed to terminate as a
consequence.

[root@dns03 ~ (master #)]# service named restart
Stopping named:                                            [  OK  ]
Starting named:                                            [  OK  ]

[root@dns03 ~ (master #)]# ps -ef | grep named
named     3729     1  0 Jun08 ?        00:00:59 /usr/sbin/named -u named
named    24749     1  0 10:04 ?        00:00:00 /usr/sbin/named -u named
root     24859 22025  0 10:06 pts/1    00:00:00 grep named

June 8 corresponds to when the dynamic update errors commenced:

/var/log/messages-20180610:Jun  8 13:37:38 dns03 named[3729]: malformed transaction:  . . .

How this problem impacted DNSSEC I do not understand, but the evidence
is that it did.  The solution to the twinned named service problem was
to first kill the named process from June 8 and then restart the
remaining named service.  We will see if the DNS problem recurs.
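
A rough sketch of how a leftover named process might be spotted before
restarting.  This is a variant of the steps above, not what was
actually run: it stops the service first and treats anything still
listed as stale.  The function name named_pids is hypothetical; it
reads "pid command" lines such as those printed by
"ps -e -o pid= -o comm=".

```shell
#!/bin/sh
# Hypothetical helper: print the PID of every process whose command
# name is exactly "named".  Input is read from stdin so the filter can
# also be tried against saved ps output.
named_pids() {
    awk '$2 == "named" { print $1 }'
}

# Suggested sequence, run as root (left commented out on purpose):
#   service named stop
#   ps -e -o pid= -o comm= | named_pids   # anything printed survived
#   kill <pid>                            # for each surviving PID
#   service named start
```

Matching on the exact command name avoids the "grep named" line that
shows up in the ps listing above.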

But the only change required to get the mail delivered was switching
the security level from 'dane' to 'may'.  That will have to go back to
'dane' once I am convinced that our underlying problems with DNSSEC
have been resolved.
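
One possible way to check DNSSEC before switching back is to look for
the "ad" (authenticated data) flag in a dig response.  The sketch below
assumes the query goes through a validating resolver that you trust;
the function name has_ad_flag is hypothetical, and example.com stands
in for the real domain and MX host.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the dig output on stdin has the "ad"
# flag set in its header line, for example
#   ;; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, ...
has_ad_flag() {
    grep -Eq '^;; flags:[^;]* ad( |;)'
}

# Example use (placeholder names, left commented out):
#   dig +dnssec MX example.com | has_ad_flag && echo "MX validated"
#   dig +dnssec TLSA _25._tcp.mx.example.com | has_ad_flag \
#       && echo "TLSA validated"
```

A missing "ad" flag does not say what is wrong, only that the resolver
did not return the answer as validated.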

-- 
***          e-Mail is NOT a SECURE channel          ***
        Do NOT transmit sensitive data via e-Mail
 Do NOT open attachments nor follow links sent by e-Mail

James B. Byrne                mailto:byrn...@harte-lyne.ca
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3
