Hi,

We got reports about a temporary resolution failure for some names under norid.no this morning. Digging through the logs, the first instance appears to be
Apr 24 08:35:02 resolver named[244]: validating zabbix-test.norid.no/CNAME: bad cache hit (norid.no/DNSKEY)

and a couple of minutes later, a rash of entries pointing to the same bad cache hit. The last entry matching this pattern was some 10 minutes later.

Looking at the code in BIND 9.14.10 (BIND 9.16.2 doesn't appear to be significantly different in this regard), there appears to be a "cache of bad records" implemented by lib/dns/badcache.c. There are two invocations of dns_resolver_addbadcache() in lib/dns/resolver.c, with fairly complicated preconditions to reach each of those two points.

However, it appears that unless I have turned on query tracing (we have not; I think we did previously, but found it to be a severe and noticeable performance hit), I will not get any logging indicating which of the two conditions was hit, or why. So the trail toward the root cause of why norid.no/DNSKEY was temporarily marked bad goes cold at this point, as far as I can see.

Our logging is configured to (among other things) log the "dnssec" and "security" categories at severity info and higher.

Is there something which can be done to improve the diagnostics for such situations? I don't suppose there is anything more to be found for this particular problem at the moment?

Regards,

- Håvard