Thanks for the followup.

> NXDOMAIN is not a "failure" response. Are you *sure* you're getting NXDOMAIN?

Yes. Pretty sure. With hindsight I should have run the tests inside a
'script' session.

> If you're using nslookup to test, be aware that it will do suffix searching
> by default, so if the original query, e.g. www.bbc.co.uk, fails, it'll
> quietly (unless debug-mode is in effect) start appending suffixes. Looking up
> those suffixed names, e.g. www.bbc.co.uk.example.com, most likely gets an
> NXDOMAIN, so nslookup reports NXDOMAIN as the overall result of the query.
> So, it's basically a misreporting of the error by nslookup.

Yes. I was mostly using nslookup. I'll try dig too next time this occurs.

> Note that only 1 of the records in your cache dump is actually relevant --
> the CNAME from www.bbc.co.uk to www.bbc.net.uk -- and the others are for a
> different part of the namespace (thdow.bbc.co.uk).

I'll contact you privately with a link to the whole cache. Every entry
tagged 'pending-*' in the cache that I tried querying failed to resolve,
many hours after the network congestion had ended.

> If you do an explicit query of the CNAME, when the problem is occurring, does
> it resolve? I would expect, even though the cache entry is marked
> "pending-answer", it will still resolve. But, without the target of the CNAME
> also resolving, the lookup as a whole cannot succeed.

I'll try that next time.

Regards
Tom.

> - Kevin
>
> -----Original Message-----
> From: bind-users-boun...@lists.isc.org
> [mailto:bind-users-boun...@lists.isc.org] On Behalf Of
> tpcb...@mklab.ph.rhul.ac.uk
> Sent: Tuesday, January 26, 2016 8:02 PM
> To: bind-users@lists.isc.org
> Subject: Name resolution failure on a caching server -- many ';
> pending-answer' records in the cache
>
> Dear All,
> I run a caching server on a section of the departmental LAN.
> Occasionally network congestion results in timeouts & name resolution
> failures. Lookups performed on name servers outside my LAN section fail with
> NXDOMAIN.
> Querying my name server for items not in its cache gets the same
> result.
>
> My problem is that long after the congestion has subsided, queries to my name
> server still result in NXDOMAIN failure. AFAICT this situation remains
> indefinitely, until the cache is flushed with 'rndc flush' or BIND is
> restarted. When it is in this state, dumping the cache with 'rndc dumpdb'
> shows numerous entries like this:
>
> --------------------------------------------------------------------------------------------
> ; pending-additional
> thdow.bbc.co.uk.        76632   NS      ns3.bbc.net.uk.
>                         76632   NS      ns4.bbc.co.uk.
>                         76632   NS      ns4.bbc.net.uk.
>                         76632   NS      ns3.bbc.co.uk.
> ; pending-answer
> ns0.thdow.bbc.co.uk.    2082    \-AAAA  ;-$NXRRSET
> ; thdow.bbc.co.uk. SOA ns.bbc.co.uk. hostmaster.bbc.co.uk. 2015122100 1800 600 864000 86400
> ; pending-answer
>                         76632   A       212.58.240.162
> ; pending-answer
> www.bbc.co.uk.          30      CNAME   www.bbc.net.uk.
> ; glue
> --------------------------------------------------------------------------------------------
>
> and attempts to look up e.g. www.bbc.co.uk result in NXDOMAIN.
>
> Browsing the documentation I noticed the parameter 'max-ncache-ttl',
> which is unset in my named.conf and apparently defaults to 3 hours.
> However, the problem persists long after 3 hours have elapsed following
> incidents of network congestion.
>
> I could set up a cron job to check name resolution on external domains and
> flush the cache when it fails? I am assuming there must be a better solution!
> Should I set max-ncache-ttl to something fairly short in my named.conf and
> hope that the default value is for some reason actually >> 3 hours?
>
> BTW is there a way to dump out all the parameters from a running named
> -- just to see all their values?
>
> Any ideas on how to solve or further diagnose the problem?
>
> Many thanks
> Tom Crane
>
> System details:
> OS: Scientific Linux CERN SLC release 6.7 (Carbon) [NB: SLC is a
> derivative of RHEL]
> BIND: bind-9.8.2-0.37.rc1.el6_7.5.x86_64
>
> Ps. I originally posted in Usenet NG comp.protocols.dns.bind but got no
> followups and then noticed all messages in that NG had this ML's fields
> 'NNTP-Posting-Host: lists.isc.org' and 'X-Original-To:
> bind-users@lists.isc.org' etc. in their headers. Is c.p.d.b actually a
> moderated group now or exclusively tied to this ML via a mail2news gateway?
>
> --
> Tom Crane, Dept. Physics, Royal Holloway, University of London, Egham Hill,
> Egham, Surrey, TW20 0EX, England.
> Email: T dot Crane at rhul dot ac dot uk
>
> _______________________________________________
> Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe
> from this list
>
> bind-users mailing list
> bind-users@lists.isc.org
> https://lists.isc.org/mailman/listinfo/bind-users

--
Tom Crane, Dept. Physics, Royal Holloway, University of London, Egham Hill,
Egham, Surrey, TW20 0EX, England.
Email: t.cr...@rhul.ac.uk
Fax: +44 (0) 1784 472794
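
For the next occurrence, here is a rough sketch of how the dig checks discussed in this thread might be scripted, so that the real response status is recorded rather than nslookup's summary. It is only an illustration under stated assumptions: the function names are made up for this example, 127.0.0.1 stands in for the caching server's address, and the 3-second timeout and single try are arbitrary choices. Names are given a trailing dot so that no search suffix can be appended (dig does not use the search list unless asked to, so this is belt and braces).

```python
import re
import subprocess

# dig prints a header line such as:
#   ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 4242
STATUS_RE = re.compile(r"status:\s*([A-Z]+)")


def parse_dig_status(dig_output):
    """Return the status (e.g. 'NOERROR', 'NXDOMAIN', 'SERVFAIL') from dig
    output, or None when no response header is present (e.g. a timeout)."""
    match = STATUS_RE.search(dig_output)
    return match.group(1) if match else None


def absolute(name):
    """Add a trailing dot so the name is treated as fully qualified."""
    return name if name.endswith(".") else name + "."


def query_status(name, rrtype="A", server="127.0.0.1"):
    """Ask `server` (assumed address of the caching server) for name/rrtype
    using dig, and return the status string or None if nothing came back."""
    cmd = ["dig", "@" + server, absolute(name), rrtype, "+time=3", "+tries=1"]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return parse_dig_status(result.stdout)


def report(queries, server="127.0.0.1"):
    """Print one line per (name, rrtype) pair with the status dig reported."""
    for name, rrtype in queries:
        status = query_status(name, rrtype, server)
        print(f"{absolute(name):30} {rrtype:6} {status or 'no response'}")
```

Calling report([("www.bbc.co.uk", "A"), ("www.bbc.co.uk", "CNAME"), ("www.bbc.net.uk", "A")]) while the problem is occurring would cover both the overall lookup and the explicit CNAME query suggested above, and would show whether the server is really answering NXDOMAIN or something else such as SERVFAIL or a timeout.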