I have an intermittent RPZ problem that I am troubleshooting.

I do a lookup for "dnshealthcheck.privatelink.azurewebsites.net" which
has a corresponding RPZ entry that looks like:
dnshealthcheck.privatelink.azurewebsites.net      A       10.254.254.254

A little after midnight, we started getting timeouts when looking up
the name, as well as SERVFAILs:
$ dig dnshealthcheck.privatelink.azurewebsites.net @ns02
;; communications error to ns02#53: timed out

; <<>> DiG 9.18.28-S1 <<>> dnshealthcheck.privatelink.azurewebsites.net @ns02
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 13401
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
; COOKIE: a960c89ec8cb08b10100000067781f4693fa76a27ba9bf46 (good)
;; QUESTION SECTION:
;dnshealthcheck.privatelink.azurewebsites.net. IN A

;; Query time: 4992 msec
;; SERVER: ns02#53(10.53.247.71) (UDP)
;; WHEN: Fri Jan 03 10:32:54 MST 2025
;; MSG SIZE  rcvd: 101

After turning up the logging for the query-errors category on one of the
secondaries (I thought it was a zone transfer issue at the time), I saw
these:
03-Jan-2025 10:46:48.863 query-errors: debug 3: client @0x7f67c0469168
ns02#49282 (dnshealthcheck.privatelink.azurewebsites.net): view
Internal: rpz QNAME rewrite
dnshealthcheck.privatelink.azurewebsites.net stop on qresult in
rpz_rewrite(): timed out
03-Jan-2025 10:46:48.863 query-errors: debug 3: client @0x7f67b7aad168
ns02#41110 (dnshealthcheck.privatelink.azurewebsites.net): view
Internal: rpz QNAME rewrite
dnshealthcheck.privatelink.azurewebsites.net stop on qresult in
rpz_rewrite(): timed out
03-Jan-2025 10:46:48.863 query-errors: debug 1: client @0x7f67c0469168
ns02#49282 (dnshealthcheck.privatelink.azurewebsites.net): view
Internal: query failed (timed out) for
dnshealthcheck.privatelink.azurewebsites.net/IN/A at query.c:8113
03-Jan-2025 10:46:48.863 query-errors: debug 1: client @0x7f67b7aad168
ns02#41110 (dnshealthcheck.privatelink.azurewebsites.net): view
Internal: query failed (timed out) for
dnshealthcheck.privatelink.azurewebsites.net/IN/A at query.c:8113
03-Jan-2025 10:46:48.863 query-errors: debug 4: fetch completed at
resolver.c:4519 for dnshealthcheck.privatelink.azurewebsites.net/A in
10.000024: timed out/success
[domain:azurewebsites.net,referral:0,restart:2,qrysent:0,timeout:0,lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
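
For anyone wanting to pull the counters out of that last "fetch
completed" entry, here is a rough Python sketch. It only splits the
bracketed summary into a dictionary; the function name is my own, and I
am not claiming anything about what each counter means beyond its label.

```python
import re

# The fetch summary exactly as it appears in the log excerpt above.
SUMMARY = (
    "[domain:azurewebsites.net,referral:0,restart:2,qrysent:0,timeout:0,"
    "lame:0,quota:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]"
)


def parse_fetch_summary(text):
    """Return (domain, counters) from a bracketed fetch summary.

    parse_fetch_summary is a hypothetical helper, not part of BIND.
    """
    match = re.search(r"\[domain:([^,\]]+),([^\]]*)\]", text)
    if match is None:
        raise ValueError("no fetch summary found")
    counters = {}
    for pair in match.group(2).split(","):
        key, _, value = pair.partition(":")
        counters[key] = int(value)
    return match.group(1), counters


domain, counters = parse_fetch_summary(SUMMARY)
# Keep only the counters that are not zero, to see what stands out.
nonzero = {key: value for key, value in counters.items() if value}
print(domain, nonzero)
```

For the entry above, the only non-zero counter is restart (2), and
qrysent is 0.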

Queries for similar "A" records in the RPZ zone also seem to be
failing. I don't think all of them are, but I can't rule it out. The
failing ones all seem to be under azurewebsites.net from what I can see
so far.
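
To check which names are actually affected rather than guessing, a
small tally over the query-errors log might help. This is only a sketch:
it assumes each log entry sits on a single line in the real log file
(the entries above were wrapped by my mail client), and the regular
expression is based solely on the "rpz QNAME rewrite" messages shown
above.

```python
import re
from collections import Counter

# Pattern taken from the "rpz QNAME rewrite" entries in the excerpt.
RPZ_FAIL = re.compile(
    r"rpz QNAME rewrite (\S+) stop on qresult in rpz_rewrite\(\): (.+)$"
)


def tally_rpz_failures(lines):
    """Count rpz_rewrite() failures per (query name, reason)."""
    counts = Counter()
    for line in lines:
        match = RPZ_FAIL.search(line.rstrip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Three entries from the excerpt, unwrapped onto single lines.
sample = [
    "03-Jan-2025 10:46:48.863 query-errors: debug 3: client @0x7f67c0469168 "
    "ns02#49282 (dnshealthcheck.privatelink.azurewebsites.net): view "
    "Internal: rpz QNAME rewrite "
    "dnshealthcheck.privatelink.azurewebsites.net stop on qresult in "
    "rpz_rewrite(): timed out",
    "03-Jan-2025 10:46:48.863 query-errors: debug 3: client @0x7f67b7aad168 "
    "ns02#41110 (dnshealthcheck.privatelink.azurewebsites.net): view "
    "Internal: rpz QNAME rewrite "
    "dnshealthcheck.privatelink.azurewebsites.net stop on qresult in "
    "rpz_rewrite(): timed out",
    "03-Jan-2025 10:46:48.863 query-errors: debug 1: client @0x7f67c0469168 "
    "ns02#49282 (dnshealthcheck.privatelink.azurewebsites.net): view "
    "Internal: query failed (timed out) for "
    "dnshealthcheck.privatelink.azurewebsites.net/IN/A at query.c:8113",
]

counts = tally_rpz_failures(sample)
for (name, reason), hits in counts.most_common():
    print(hits, name, reason)
```

Feeding it the lines of the full log instead of the sample should show
whether every failing name is under azurewebsites.net.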

I looked at the source for "query.c" at line 8113 and it looks like a
logging function, and so doesn't seem to give me more information
about what was going on when the timeout happened, though I am
certainly not a programmer.

And at one point on one of the servers, I stumbled across this:

Jan  3 12:03:14 ns0 named[10233]: network unreachable resolving
'dnshealthcheck.privatelink.azurewebsites.net/A/IN':
2a01:111:4000:700::e0#53

That surprised me. I figured that if there was an "A" record for a name
in an RPZ zone, BIND would never do a lookup for it, since the answer
would be overridden anyway.

That said, the failures seem to correlate with at least one of the
nameservers for azurewebsites.net being unreachable
(ns1-224.azure-dns.com., 13.107.236.224). But that shouldn't matter,
since the resolver should switch to another nameserver.

The issue ran for several hours, then resolved itself, then broke
again for a few hours, then resolved itself again, and things seem to
be fine now. All of this happened without me doing anything (as far as
I know; there is a lot of automation going on).
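
To pin down exactly when it breaks and recovers next time, I could run
a small probe that repeats the dig test above and logs a timestamped
status. The following is a minimal sketch using only the Python standard
library. The function names, the 30-second interval, and the 5-second
timeout are my own assumptions. Unlike dig, it sends no EDNS or COOKIE
option, so it is not an exact replica of the dig query.

```python
import socket
import struct
import time

RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name, query_id):
    """Build a minimal DNS query for an A record with the RD flag set."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def rcode_of(response):
    """Return the RCODE name from the low 4 bits of the flags field."""
    flags = struct.unpack("!H", response[2:4])[0]
    code = flags & 0x000F
    return RCODES.get(code, str(code))


def probe(server, name, timeout=5.0):
    """Send one query over UDP and return (status, elapsed_seconds)."""
    query = build_query(name, int(time.time() * 1000) & 0xFFFF)
    start = time.monotonic()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            response, _ = sock.recvfrom(4096)
            if len(response) < 12 or response[:2] != query[:2]:
                status = "BADRESP"  # too short, or the ID does not match
            else:
                status = rcode_of(response)
        except socket.timeout:
            status = "TIMEOUT"
        except OSError as exc:
            status = f"ERROR({exc})"
    return status, time.monotonic() - start


def watch(server, name, interval=30.0):
    """Probe forever, printing one timestamped line per attempt."""
    while True:
        status, elapsed = probe(server, name)
        stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
        print(stamp, status, f"{elapsed:.3f}s", flush=True)
        time.sleep(interval)


# Example call (runs until interrupted); 10.53.247.71 is ns02 in the
# dig output above:
# watch("10.53.247.71", "dnshealthcheck.privatelink.azurewebsites.net")
```

The timestamps from a run like this could then be lined up against the
query-errors entries and the "network unreachable" messages.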

So does BIND do a lookup of an RPZ name even if it is going to
override it? Also, if this happens again, are there places I should
look besides query-errors for indicators of why it is failing?

Thanks for any help,
  Adam Augustine

