Gerben Wierda wrote:
> When a client is rejected because of a missing reverse hostname, I see:
> 
> Nov 21 15:37:02 mail smtp/smtpd[2168]: NOQUEUE: reject: RCPT from 
> unknown[46.221.40.2]: 450 4.7.1 Client host rejected: cannot find your 
> reverse hostname, [46.221.40.2]; from=<a95901...@rna.nl> 
> to=<a95901...@rna.nl> proto=ESMTP helo=<[46.221.40.2]>
> 
> And I was just wondering: why is that a 4xx message (temp failure) and not a 
> 5xx failure?

The DNS system is a marvelous distributed database.  It is designed
to operate with redundancy.  However there may still be failures where
one cannot at that instant complete the DNS lookup successfully and
that means the answer is unknown.  Unknown answers are unknown.  Try
again and you might get an answer because perhaps the problem
preventing the answer is no longer present.  The configuration may be
changing.  The configuration may be insufficiently redundant.  The
configuration may be faulty.

For example DNS mostly uses UDP packets because UDP is the lightest
weight and least resource hungry of the Internet protocol packet types
(the "TCP/IP" that we mostly hear about includes UDP too).  One end
sends a UDP packet out into the network with the hope that it
arrives.  And then another UDP packet is sent back in the hope that it
will return.  But let's say that a router in the middle is under
attack or otherwise being abused.  It can become overloaded to the
point that it will not be able to
deliver all of the packets of all types.  Some packets may time out
before being able to be delivered.  They will be discarded from the
router queue.
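
A minimal sketch of that "send and hope" behavior, assuming plain UDP
sockets rather than a real DNS resolver.  The function name udp_query
and its parameters are made up for illustration.  The point is only
that a lost packet shows up as a timeout, which tells the sender
nothing about what the answer would have been.

```python
import socket

def udp_query(server, payload, attempts=3, timeout=2.0):
    """Send payload over UDP and wait for one reply.

    Returns the reply bytes, or None if every attempt timed out.
    None means "unknown", not "the answer is no".
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        for _ in range(attempts):
            sock.sendto(payload, server)   # fire and hope it arrives
            try:
                reply, _addr = sock.recvfrom(4096)
                return reply
            except socket.timeout:
                continue                   # lost somewhere; try again
    return None
```

If every attempt times out the caller gets None and has to decide what
to do with an unknown.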

Let's say something like this happens right at the moment a DNS lookup
was attempted.  The DNS lookup would fail to contact the remote
servers of the distributed database and be unable to return an
authoritative answer.  The lookup would fail.  The local nameserver
will cache that failure for 5 minutes (I think) so that it
doesn't add to the problem with repeated lookup attempts.  If the
lookup is retried an answer may be successfully provided on a
subsequent attempt.  The query side can't know if the failure is going
to persist or is simply a temporary failure.  Try again and it might
work at that next moment, provided sufficient redundancy was
configured such that the overloaded route might be avoided, and
provided the overload was not at a site border router.
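
To make the distinction concrete, here is a rough sketch of how a
receiving server might turn a reverse lookup result into an SMTP reply
code.  This is not Postfix's actual code.  The names
reverse_lookup_status and smtp_reply, and the choice of 450 as the
default for a "not found" answer (picked to match the log line above),
are assumptions for illustration.  The h_errno numbers are the ones
conventionally defined in <netdb.h>.

```python
import socket

# h_errno values as conventionally defined in <netdb.h>.
HOST_NOT_FOUND = 1   # a server answered: no such name
TRY_AGAIN = 2        # temporary failure, e.g. no server answered

def reverse_lookup_status(ip, lookup=socket.gethostbyaddr):
    """Classify a reverse lookup as 'ok', 'notfound' or 'tempfail'."""
    try:
        name, _aliases, _addrs = lookup(ip)
        return ("ok", name)
    except socket.herror as err:
        if err.errno == TRY_AGAIN:
            return ("tempfail", None)
        return ("notfound", None)
    except OSError:
        # Anything else is treated as unknown (a cautious assumption).
        return ("tempfail", None)

def smtp_reply(status, notfound_code=450):
    """Hypothetical policy: None means accept, otherwise a reply code."""
    if status == "ok":
        return None
    if status == "tempfail":
        return 450               # unknown answer: ask the sender to retry
    return notfound_code         # site policy decides between 4xx and 5xx
```

With this policy an unknown answer never turns into a permanent
rejection.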

Additionally someone may have configured the DNS zone database
information incorrectly.  Someone might then fix the information,
which would make it work later.  Maybe they configured redundancy
poorly.  I have often seen "multiple" nameservers configured all on
the same host system, effectively cheating on the requirement for
redundancy and making the only route a single point of failure.

Just earlier this summer I debugged a configuration error where
someone had set the secondary zone expiration timeout to 3600 seconds,
1 hour, instead of the more typical 2 weeks or so that it should
normally have.  And then the primary server was offline for 4 hours
due to a networking problem.  After one hour all of the secondary
nameservers globally stopped serving that domain, as they were
instructed to do, and all lookups for that domain failed.  When the
primary came back online everything started working normally again.
Redundancy was configured, but incorrectly, so it only lasted 1 hour.
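
The arithmetic can be sketched as below.  The function name is made
up, and it assumes the secondaries last refreshed from the primary
right before the primary went offline, so the expire timer effectively
starts at that moment.

```python
HOUR = 3600
DAY = 24 * HOUR

def secondary_outage_seconds(expire, primary_down):
    """Seconds during which secondaries refuse to serve the zone.

    expire:       zone expire timeout in seconds
    primary_down: how long the primary was unreachable, in seconds
    """
    return max(0, primary_down - expire)

# The misconfigured zone: 1 hour expire, primary offline for 4 hours.
broken = secondary_outage_seconds(expire=1 * HOUR, primary_down=4 * HOUR)

# The more typical setting: 2 weeks expire, same 4 hour failure.
typical = secondary_outage_seconds(expire=14 * DAY, primary_down=4 * HOUR)
```

Here broken works out to 3 hours of failed lookups and typical to
none at all.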

In all of these cases it was desirable for the mail in transit to
simply queue and retry later.  In all of these cases the mail was
delivered on a later retry once things were working again.
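
From the sending side, the difference between the two reply classes
looks roughly like this.  It is a simplified sketch, not how any
particular mail server implements its queue, and the function name and
return strings are made up for illustration.

```python
def sender_action(reply_code):
    """What a sending mail server does with an SMTP reply code."""
    if 200 <= reply_code < 300:
        return "delivered"
    if 400 <= reply_code < 500:
        return "queue and retry later"    # temporary failure
    if 500 <= reply_code < 600:
        return "bounce to sender"         # permanent failure
    raise ValueError("unexpected reply code: %d" % reply_code)
```

A 450 keeps the message in the queue for another attempt, while a 550
would have bounced it on the first DNS hiccup.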

Bob
