On 21.11.2019 at 11:47, Bind Mailinglist wrote:
> Hello Ondřej
>
> This is an interesting case and not easy to detect, but I was able to
> get a few steps further.
> Since I always have to clear the cache for the host
> tm.inregion.waas.oci.oraclecloud.net, I focused my monitoring on that.
>
> 1. On my caching servers I traced this host with Wireshark. In most
>    cases my other servers replied to the queries (mostly A, some CNAME)
>    with another CNAME.
>    When the problem appeared, the last reply was an SOA from my DNS
>    server. So why does my DNS server send such an SOA reply to the
>    cache server?
>
> 2. I then did the same on my DNS servers.
>    There, all A queries for tm.inregion.waas.oci.oraclecloud.net were
>    answered by the authoritative servers with a CNAME to a very dynamic
>    host. That may be quite normal for this Oracle cloud.
>    But there were also a few CNAME queries for the same host, and for
>    CNAME queries I always got an SOA answer.
>    About 1.5 s later my server queried again for an A record, and that
>    query was answered.
>
> What happens when my cache queries my DNS server for the same host in
> the time between the SOA reply and the next A reply from the
> authoritative server?
>
> I can reproduce it like this:
>
> The CNAME query:
>
> $ dig @ns1.p17.dynect.net tm.inregion.waas.oci.oraclecloud.net CNAME
>
> ; <<>> DiG 9.9.5-3ubuntu0.19-Ubuntu <<>> @ns1.p17.dynect.net
> tm.inregion.waas.oci.oraclecloud.net CNAME
> ; (2 servers found)
> ;; global options: +cmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 24630
> ;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
> ;; WARNING: recursion requested but not available
>
> ;; OPT PSEUDOSECTION:
> ; EDNS: version: 0, flags:; udp: 4096
> ;; QUESTION SECTION:
> ;tm.inregion.waas.oci.oraclecloud.net. IN CNAME
>
> ;; AUTHORITY SECTION:
> inregion.waas.oci.oraclecloud.net. 1800 IN SOA
> ns1.p17.dynect.net. hostmaster.inregion.waas.oci.oraclecloud.net.
> 1574248545 3600 600 604800 1800
>
> ;; Query time: 15 msec
> ;; SERVER: 2001:500:90:1::17#53(2001:500:90:1::17)
> ;; WHEN: Thu Nov 21 11:44:41 CET 2019
> ;; MSG SIZE rcvd: 127
>
>
> The A query:
>
> $ dig @ns1.p17.dynect.net tm.inregion.waas.oci.oraclecloud.net A
>
> ; <<>> DiG 9.9.5-3ubuntu0.19-Ubuntu <<>> @ns1.p17.dynect.net
> tm.inregion.waas.oci.oraclecloud.net A
> ; (2 servers found)
> ;; global options: +cmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55743
> ;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 1
> ;; WARNING: recursion requested but not available
>
> ;; OPT PSEUDOSECTION:
> ; EDNS: version: 0, flags:; udp: 4096
> ;; QUESTION SECTION:
> ;tm.inregion.waas.oci.oraclecloud.net. IN A
>
> ;; ANSWER SECTION:
> tm.inregion.waas.oci.oraclecloud.net. 30 IN CNAME
> eu-switzerland.inregion.waas.oci.oraclecloud.net.
>
> ;; AUTHORITY SECTION:
> inregion.waas.oci.oraclecloud.net. 86400 IN NS ns4.p17.dynect.net.
> inregion.waas.oci.oraclecloud.net. 86400 IN NS ns3.p17.dynect.net.
> inregion.waas.oci.oraclecloud.net. 86400 IN NS ns1.p17.dynect.net.
> inregion.waas.oci.oraclecloud.net. 86400 IN NS ns2.p17.dynect.net.
>
> ;; Query time: 14 msec
> ;; SERVER: 2001:500:90:1::17#53(2001:500:90:1::17)
> ;; WHEN: Thu Nov 21 11:45:38 CET 2019
> ;; MSG SIZE rcvd: 255
>
> But I'm still not sure whether that is my problem.
>
> Regards
> Florian
>
>
> On 20.11.2019 at 18:16, Ondřej Surý wrote:
>> The cache shows you that the forwarder reported that no such record
>> was returned from the upstream resolvers.
>>
>> NXRRSET means Non-eXistent Resource Record Set, i.e. your resolvers
>> cached the fact, returned from the upstream resolvers, that no record
>> set of that type exists for the name.
>>
>> The other option would be to run the affected query against the
>> upstream resolvers in a semi-tight loop and log the results.
>>
>> while true; do echo "$(date -R): $(dig +short IN A <domain> @<forwarder>)";
>> sleep 1; done
>>
>> Ondrej
>> --
>> Ondřej Surý
>> ond...@isc.org
>>
>>> On 21 Nov 2019, at 01:09, Bind Mailinglist <bindbandb...@ggaweb.ch> wrote:
>>>
>>> Hello Ondřej
>>>
>>> Many thanks for your answer. I hope debugging can help me without
>>> overloading the servers.
>>> They see a peak load of around 1500 queries/s during the evening
>>> hours. It will take some time to log exactly this effect.
>>> At the moment I have the following lines disabled:
>>> // forwarders {
>>> //     213.160.41.2;
>>> //     213.160.40.34;
>>> // };
>>> About the AAAA answer: does it matter whether I query A or AAAA if
>>> there is only a CNAME as an answer?
>>> My last test shows the following cache entry. This happened around
>>> 20 min after restarting bind with my forwarders enabled.
>>> ; answer
>>> tm.inregion.waas.oci.oraclecloud.net. 1697 \-A ;-$NXRRSET
>>> Could a server timeout end up as such a cache entry? Or does it need
>>> a valid answer from the forwarders? What do you think?
>>> I tried to force forwarding by adding "forwarding only", but the
>>> result was the same.
>>>
>>> Regards
>>> Florian
>>>
>>>
>>> On 20.11.2019 at 11:58, Ondřej Surý wrote:
>>>> Hi,
>>>>
>>>> you mentioned “forwarders”. What are these, and what does the AAAA
>>>> answer look like on the upstream forwarders?
>>>>
>>>> I would recommend enabling a higher debug level (start with -d 1)
>>>> and checking the logs for the answer from the forwarders that
>>>> preceded the failure.
>>>>
>>>> Ondrej
>>>> --
>>>> Ondřej Surý - ISC
>>>>
>>>>
>>>>> On 20 Nov 2019, at 18:44, Bind Mailinglist <bindbandb...@ggaweb.ch>
>>>>> wrote:
>>>>>
>>>>> Hello list
>>>>>
>>>>> I'm glad there is such an active list. I hope there is somebody out
>>>>> there who can help me with my little problem. :-)
>>>>> We are running six bind servers (all Ubuntu LTS 18.04 with bind
>>>>> 9.11.3), so they are pretty up to date.
>>>>> Three of them have authoritative zones, one is for testing and two
>>>>> are just caching servers. And that is where my problem starts.
>>>>> 1. It only appears on my caching servers, and only if I use my
>>>>>    other servers as forwarders.
>>>>> 2. While the problem is occurring on my caching servers, I can
>>>>>    still resolve the names directly through my forwarders.
>>>>> 3. Only one organisation with several newspapers is affected. There
>>>>>    may be others, but I don't know of any at the moment.
>>>>>
>>>>> OK, all these newspapers are hosted on Oracle Cloud with short TTLs
>>>>> of around 30 s.
>>>>>
>>>>> # dig www.20min.ch
>>>>>
>>>>> ;; ANSWER SECTION:
>>>>> www.20min.ch. 39 IN CNAME tamedia.a.inregion.waas.oci.oraclecloud.net.
>>>>> tamedia.a.inregion.waas.oci.oraclecloud.net. 16 IN CNAME tm.inregion.waas.oci.oraclecloud.net.
>>>>> tm.inregion.waas.oci.oraclecloud.net. 16 IN CNAME eu-london.inregion.waas.oci.oraclecloud.net.
>>>>> eu-london.inregion.waas.oci.oraclecloud.net. 28 IN A 138.1.82.213
>>>>> eu-london.inregion.waas.oci.oraclecloud.net. 28 IN A 147.154.234.67
>>>>> eu-london.inregion.waas.oci.oraclecloud.net. 28 IN A 147.154.228.138
>>>>>
>>>>> # dig www.tagesanzeiger.ch
>>>>>
>>>>> ;; ANSWER SECTION:
>>>>> www.tagesanzeiger.ch. 113 IN CNAME cnp-a-cre-p.newsnetz.ch.
>>>>> cnp-a-cre-p.newsnetz.ch. 113 IN CNAME tamedia.a.inregion.waas.oci.oraclecloud.net.
>>>>> tamedia.a.inregion.waas.oci.oraclecloud.net. 11 IN CNAME tm.inregion.waas.oci.oraclecloud.net.
>>>>> tm.inregion.waas.oci.oraclecloud.net. 12 IN CNAME eu-switzerland.inregion.waas.oci.oraclecloud.net.
>>>>> eu-switzerland.inregion.waas.oci.oraclecloud.net. 12 IN A 192.29.59.121
>>>>> eu-switzerland.inregion.waas.oci.oraclecloud.net. 12 IN A 192.29.58.46
>>>>> eu-switzerland.inregion.waas.oci.oraclecloud.net. 12 IN A 192.29.58.42
>>>>>
>>>>>
>>>>> Now, if I use my caching servers with forwarders enabled, I quite
>>>>> often run into cases where resolving stops working for these two
>>>>> domains at the same time.
>>>>> When I take a cache dump I see the following line:
>>>>> ; answer
>>>>> tm.inregion.waas.oci.oraclecloud.net. 893 \-AAAA ;-$NXRRSET
>>>>>
>>>>> I have to clear this host from the cache to make it work again,
>>>>> and then only for a few minutes.
>>>>> The annoying thing is that this NXRRSET cache entry has a much
>>>>> longer lifetime than the records themselves, so resolving stops
>>>>> working on my caching servers for more than 15 min.
>>>>>
>>>>> Any idea how I could find out why this happens?
>>>>> There must be something going on between my DNS servers. They are
>>>>> in the same network, so there is no firewall between them.
>>>>>
>>>>> Many thanks and regards
>>>>> Florian
>>>>>
>>>>> _______________________________________________
>>>>> Please visit https://lists.isc.org/mailman/listinfo/bind-users
>>>>> to unsubscribe from this list
>>>>>
>>>>> bind-users mailing list
>>>>> bind-users@lists.isc.org
>>>>> https://lists.isc.org/mailman/listinfo/bind-users
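
On the lifetime of the NXRRSET entry discussed above: under RFC 2308 a negative answer may be cached for the smaller of the SOA record's own TTL and its MINIMUM field. For the SOA in the CNAME reply that would be 1800 s, which would fit the 893 and 1697 seen in the cache dumps. The following is only a sketch of that arithmetic, assuming dig-style SOA lines on stdin. The name negative_ttl is a made-up helper, not a BIND or dig feature, and this does not show what actually caused the entry.

```shell
#!/bin/sh
# negative_ttl (hypothetical helper): read dig-style SOA record lines on
# stdin and print, for each one, the negative-caching TTL suggested by
# RFC 2308: min(TTL of the SOA record, SOA MINIMUM field).
negative_ttl() {
    awk '$4 == "SOA" && NF >= 11 {
        ttl = $2 + 0        # TTL of the SOA record in the authority section
        minimum = $11 + 0   # last SOA field (MINIMUM)
        print (ttl < minimum ? ttl : minimum)
    }'
}

# SOA record copied from the CNAME reply quoted in this thread.
soa='inregion.waas.oci.oraclecloud.net. 1800 IN SOA ns1.p17.dynect.net. hostmaster.inregion.waas.oci.oraclecloud.net. 1574248545 3600 600 604800 1800'

printf '%s\n' "$soa" | negative_ttl
```

Against a live server, the same function could be fed with something like
"dig +noall +authority @ns1.p17.dynect.net tm.inregion.waas.oci.oraclecloud.net CNAME | negative_ttl".
A resolver may cap the value further, for example with BIND's max-ncache-ttl option.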