In message <4b701ec5.6060...@arcelormittal.com>, Cedric Lejeune writes:
> Hello list,
>
> Sorry to bother you but I really need help since I cannot figure out
> what I am doing wrong. I am trying to set up a new DNS server: it
> behaves as expected in a test environment, but in a production
> environment, it seems to get overloaded, the number of recursive clients
> increases until it reaches recursive-clients, a lot of timeouts occur
> and the server is no longer able to answer any query. The main clients
> of this server are spam filters (spamassassin) and mail routers. I have
> googled for this issue and the only thing I have found that may explain
> this issue is that our firewalls are mishandling packet
> fragmentation/sizes larger than 512 bytes. So I have checked this using
> this thread
> http://groups.google.com/group/comp.protocols.dns.bind/browse_thread/thread/cfa8c63ec6bd08d6
>
> and it seems everything is fine. So, as a last resort, I bother you...
> Do you have any hint that would help me to track down what is wrong?
>
> Thank you for your help.
>
> Kind regards,
>
> cedric.
>
> Possibly useful information:
>
> System: Debian testing
> Bind version: 9.6.1.dfsg.P1-1
>
> --------%<--------%<--------%<--------%<--------%<--------%<--------%<--------
>
> named.conf
>
> // This is the primary configuration file for the BIND DNS server named.
> //
> // Please read /usr/share/doc/bind9/README.Debian.gz for information on the
> // structure of BIND configuration files in Debian, *BEFORE* you customize
> // this configuration file.
> //
> // If you are just adding zones, please do that in /etc/bind/named.conf.local
>
> include "/etc/bind/named.conf.options";
>
> include "/etc/bind/named.conf.local";
>
> --------%<--------%<--------%<--------%<--------%<--------%<--------%<--------
>
> named.conf.options
>
> logging {
>         channel debug {
>                 file "/tmp/debug";
>                 severity debug 2;
>                 print-category yes;
>                 print-time yes;
>                 print-severity yes;
>         };
>
>         category default {
>                 debug;
>         };
> };
>
> options {
>         directory "/var/cache/bind";
>
>         // If there is a firewall between you and nameservers you want
>         // to talk to, you may need to fix the firewall to allow multiple
>         // ports to talk. See http://www.kb.cert.org/vuls/id/800113
>
>         // If your ISP provided one or more IP addresses for stable
>         // nameservers, you probably want to use them as forwarders.
>         // Uncomment the following block, and insert the addresses replacing
>         // the all-0's placeholder.
>
>         // forwarders {
>         //      0.0.0.0;
>         // };
>
>         auth-nxdomain no;    // conform to RFC1035
>         // listen-on-v6 { any; };
>
>         allow-transfer {
>                 X.X.X.X;
>                 Y.Y.Y.Y;
>         };
>
>         allow-query-cache { any; };
>         allow-recursion { any; };
>
>         querylog yes;
>
>         recursive-clients 2000;
> };
>
> --------%<--------%<--------%<--------%<--------%<--------%<--------%<--------
>
> named.conf.local
>
> //
> // Do any local configuration here
> //
>
> // Consider adding the 1918 zones here, if they are not used in your
> // organization
> // include "/etc/bind/zones.rfc1918";
>
> include "/etc/bind/zone.hint";
> include "/etc/bind/zones.rfc1912";
> include "/etc/bind/zones.rfc1918";
> include "/etc/bind/zones.master";
> include "/etc/bind/zones.slave";
>
> --------%<--------%<--------%<--------%<--------%<--------%<--------%<--------
>
> /etc/default/bind9
>
> # run resolvconf?
> RESOLVCONF=yes
>
> # startup options for the server
> OPTIONS="-4 -u bind"
>
> --------%<--------%<--------%<--------%<--------%<--------%<--------%<--------
>
> # dig +norec +dnssec www.google.com @a.root-servers.net
>
> ; <<>> DiG 9.6.1-P1 <<>> +norec +dnssec www.google.com @a.root-servers.net
> ;; global options: +cmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55758
> ;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 13, ADDITIONAL: 16
>
> ;; OPT PSEUDOSECTION:
> ; EDNS: version: 0, flags: do; udp: 512
> ;; QUESTION SECTION:
> ;www.google.com.                IN      A
>
> ;; AUTHORITY SECTION:
> com.                    172800  IN      NS      a.gtld-servers.net.
> com.                    172800  IN      NS      b.gtld-servers.net.
> com.                    172800  IN      NS      c.gtld-servers.net.
> com.                    172800  IN      NS      d.gtld-servers.net.
> com.                    172800  IN      NS      e.gtld-servers.net.
> com.                    172800  IN      NS      f.gtld-servers.net.
> com.                    172800  IN      NS      g.gtld-servers.net.
> com.                    172800  IN      NS      h.gtld-servers.net.
> com.                    172800  IN      NS      i.gtld-servers.net.
> com.                    172800  IN      NS      j.gtld-servers.net.
> com.                    172800  IN      NS      k.gtld-servers.net.
> com.                    172800  IN      NS      l.gtld-servers.net.
> com.                    172800  IN      NS      m.gtld-servers.net.
>
> ;; ADDITIONAL SECTION:
> a.gtld-servers.net.     172800  IN      A       192.5.6.30
> a.gtld-servers.net.     172800  IN      AAAA    2001:503:a83e::2:30
> b.gtld-servers.net.     172800  IN      A       192.33.14.30
> b.gtld-servers.net.     172800  IN      AAAA    2001:503:231d::2:30
> c.gtld-servers.net.     172800  IN      A       192.26.92.30
> d.gtld-servers.net.     172800  IN      A       192.31.80.30
> e.gtld-servers.net.     172800  IN      A       192.12.94.30
> f.gtld-servers.net.     172800  IN      A       192.35.51.30
> g.gtld-servers.net.     172800  IN      A       192.42.93.30
> h.gtld-servers.net.     172800  IN      A       192.54.112.30
> i.gtld-servers.net.     172800  IN      A       192.43.172.30
> j.gtld-servers.net.     172800  IN      A       192.48.79.30
> k.gtld-servers.net.     172800  IN      A       192.52.178.30
> l.gtld-servers.net.     172800  IN      A       192.41.162.30
> m.gtld-servers.net.     172800  IN      A       192.55.83.30
>
> ;; Query time: 10 msec
> ;; SERVER: 198.41.0.4#53(198.41.0.4)
> ;; WHEN: Mon Feb 8 15:03:49 2010
> ;; MSG SIZE rcvd: 531
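
The line to read in the output above is MSG SIZE rcvd. Two limits matter
for this kind of test: 512 bytes, the largest UDP response plain DNS
allows without EDNS, and the path MTU, past which the response has to be
fragmented. A rough Python sketch of that comparison follows. The
function names are my own, and the 1500-byte MTU and 28 bytes of IPv4 +
UDP headers are assumptions about a typical Ethernet path, not something
measured on your network.

```python
import re

PLAIN_DNS_LIMIT = 512      # largest UDP response without EDNS (RFC 1035)
MTU = 1500                 # assumption: ordinary Ethernet path MTU
IP_UDP_OVERHEAD = 20 + 8   # assumption: IPv4 header without options + UDP header


def msg_size(dig_output):
    """Return N from a ';; MSG SIZE rcvd: N' line, or None if absent."""
    match = re.search(r"MSG SIZE\s+rcvd:\s*(\d+)", dig_output)
    return int(match.group(1)) if match else None


def classify(size):
    """Say what a response of `size` bytes proves about the path."""
    if size <= PLAIN_DNS_LIMIT:
        return "fits in plain DNS, proves nothing"
    if size + IP_UDP_OVERHEAD <= MTU:
        return "over 512, not fragmented"
    return "over 512, fragmented"


if __name__ == "__main__":
    # Sizes taken from saved dig output; feed in your own captures here.
    for text in (";; MSG SIZE rcvd: 531", ";; MSG SIZE rcvd: 1906"):
        size = msg_size(text)
        print(size, classify(size))
```

Only a response in the last category exercises fragment handling in the
firewall, so that is the case worth testing.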

Good, you are not blocking packets > 512 bytes.

> --------%<--------%<--------%<--------%<--------%<--------%<--------%<--------
>
> # dig +dnssec +norec +ignore dnskey se @A.NS.se
>
> ;; Query time: 48 msec
> ;; SERVER: 192.36.144.107#53(192.36.144.107)
> ;; WHEN: Mon Feb 8 15:04:52 2010
> ;; MSG SIZE rcvd: 1203

This one didn't reach the fragmentation threshold (1203 < 1500).
SE have tuned their dnskey response in the last two years. Try
"dig +dnssec +norec +ignore any . @l.root-servers.net". I get
1906 bytes, which is well over the threshold.

;; Query time: 229 msec
;; SERVER: 2001:500:3::42#53(2001:500:3::42)
;; WHEN: Tue Feb 9 08:20:00 2010
;; MSG SIZE rcvd: 1906

> --------%<--------%<--------%<--------%<--------%<--------%<--------%<--------
>
> Log extract:
>
> ...
> 08-Feb-2010 14:39:56.391 query-errors: debug 1: client X.X.X.X#12695: query failed (SERVFAIL) for 11.94.88.195.dnsbl.sorbs.net/IN/A at query.c:4619
> 08-Feb-2010 14:39:56.391 query-errors: debug 2: fetch completed at resolver.c:3121 for 11.94.88.195.dnsbl.sorbs.net/A in 30.000143: timed out/success [domain:dnsbl.sorbs.NET,referral:0,restart:1,qrysent:13,timeout:12,lame:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]

Run "dig +trace +dnssec 11.94.88.195.dnsbl.sorbs.net" from the box the
recursive nameserver is on and see what happens.

> 08-Feb-2010 14:39:56.392 query-errors: debug 1: client X.X.X.X#48028: query failed (SERVFAIL) for euro-index.be/IN/A at query.c:4619
> 08-Feb-2010 14:39:56.392 query-errors: debug 2: fetch completed at resolver.c:3121 for euro-index.be/A in 30.000085: timed out/success [domain:.,referral:0,restart:1,qrysent:11,timeout:10,lame:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]

Run "dig +trace +dnssec euro-index.be" and see what happens.
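
If you want to run the same trace for every name that failed, something
like this Python sketch pulls the names out of the query-errors lines
and builds the matching dig command for each. It assumes it reads the
log file itself, one entry per line rather than wrapped as in this mail,
and trace_commands is a made-up helper name.

```python
import re

# Matches e.g. "query failed (SERVFAIL) for euro-index.be/IN/MX at query.c:4619"
FAILED = re.compile(r"query failed \(SERVFAIL\) for (\S+)/IN/(\S+)")


def trace_commands(log_lines):
    """Return one dig argv list per distinct failed (name, type) pair."""
    seen = []
    for line in log_lines:
        match = FAILED.search(line)
        if match and match.groups() not in seen:
            seen.append(match.groups())
    return [["dig", "+trace", "+dnssec", name, qtype] for name, qtype in seen]


if __name__ == "__main__":
    sample = [
        "08-Feb-2010 14:39:56.392 query-errors: debug 1: client X.X.X.X#48028: "
        "query failed (SERVFAIL) for euro-index.be/IN/MX at query.c:4619",
    ]
    for argv in trace_commands(sample):
        # Print rather than execute; pass argv to subprocess.run() to run it.
        print(" ".join(argv))
```

Run the resulting commands on the box the recursive nameserver is on,
so they go through the same firewall path as named.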

> 08-Feb-2010 14:39:56.392 query-errors: debug 1: client X.X.X.X#48028: query failed (SERVFAIL) for euro-index.be/IN/MX at query.c:4619
> 08-Feb-2010 14:39:56.393 query-errors: debug 2: fetch completed at resolver.c:3121 for euro-index.be/MX in 30.000111: timed out/success [domain:.,referral:0,restart:1,qrysent:11,timeout:10,lame:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]

Run "dig +trace +dnssec euro-index.be mx" and see what happens.

> 08-Feb-2010 14:39:56.394 query-errors: debug 1: client X.X.X.X#48028: query failed (SERVFAIL) for 218.208.78.194.dnsbl.sorbs.net/IN/A at query.c:4619
> 08-Feb-2010 14:39:56.394 query-errors: debug 2: fetch completed at resolver.c:3121 for 218.208.78.194.dnsbl.sorbs.net/A in 30.000152: timed out/success [domain:dnsbl.sorbs.NET,referral:0,restart:1,qrysent:13,timeout:12,lame:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
> 08-Feb-2010 14:39:56.396 query-errors: debug 1: client X.X.X.X#48028: query failed (SERVFAIL) for 218.208.78.194.zen.spamhaus.org/IN/A at query.c:4619
> 08-Feb-2010 14:39:56.396 query-errors: debug 2: fetch completed at resolver.c:3121 for 218.208.78.194.zen.spamhaus.org/A in 30.000175: timed out/success [domain:zen.spamhaus.org,referral:0,restart:1,qrysent:22,timeout:21,lame:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
> 08-Feb-2010 14:39:56.396 query-errors: debug 1: client X.X.X.X#48028: query failed (SERVFAIL) for euro-index.be.fulldom.rfc-ignorant.org/IN/A at query.c:4619
> 08-Feb-2010 14:39:56.396 query-errors: debug 2: fetch completed at resolver.c:3121 for euro-index.be.fulldom.rfc-ignorant.org/A in 30.000098: timed out/success [domain:rfc-ignorant.org,referral:0,restart:4,qrysent:4,timeout:3,lame:0,neterr:0,badresp:0,adberr:4,findfail:0,valfail:0]
> 08-Feb-2010 14:39:56.417 query-errors: debug 1: client X.X.X.X#12695: query failed (SERVFAIL) for 11.94.88.195.zen.spamhaus.org/IN/A at query.c:4619
> 08-Feb-2010 14:39:56.417 query-errors: debug 2: fetch completed at resolver.c:3121 for 11.94.88.195.zen.spamhaus.org/A in 30.000161: timed out/success [domain:zen.spamhaus.org,referral:0,restart:1,qrysent:22,timeout:21,lame:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
> 08-Feb-2010 14:39:56.418 query-errors: debug 1: client X.X.X.X#12695: query failed (SERVFAIL) for ukrs238770.pur3.net.fulldom.rfc-ignorant.org/IN/A at query.c:4619
> 08-Feb-2010 14:39:56.418 query-errors: debug 2: fetch completed at resolver.c:3121 for ukrs238770.pur3.net.fulldom.rfc-ignorant.org/A in 30.000102: timed out/success [domain:rfc-ignorant.org,referral:0,restart:4,qrysent:4,timeout:3,lame:0,neterr:0,badresp:0,adberr:4,findfail:0,valfail:0]
> 08-Feb-2010 14:39:56.479 query-errors: debug 1: client X.X.X.X#35810: query failed (SERVFAIL) for 227.228.181.88.combined.njabl.org/IN/A at query.c:4619
> 08-Feb-2010 14:39:56.479 query-errors: debug 2: fetch completed at resolver.c:3121 for 227.228.181.88.combined.njabl.org/A in 30.000118: timed out/success [domain:combined.njabl.org,referral:0,restart:1,qrysent:11,timeout:10,lame:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
> 08-Feb-2010 14:39:56.479 query-errors: debug 1: client X.X.X.X#35810: query failed (SERVFAIL) for 3.42.27.212.combined.njabl.org/IN/A at query.c:4619
> 08-Feb-2010 14:39:56.479 query-errors: debug 2: fetch completed at resolver.c:3121 for 3.42.27.212.combined.njabl.org/A in 30.000156: timed out/success [domain:combined.njabl.org,referral:0,restart:1,qrysent:10,timeout:9,lame:0,neterr:0,badresp:0,adberr:0,findfail:0,valfail:0]
> ...
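
The counters in the bracketed part of each "fetch completed" line are
worth adding up. In the extract above nearly every query sent upstream
timed out (qrysent:13 with timeout:12, qrysent:22 with timeout:21) and
each fetch ran for 30 seconds, which would be consistent with the
recursive client count climbing. Here is a small Python sketch that
totals qrysent and timeout per domain. It assumes unwrapped log lines in
the format shown above; parse_fetch and summarize are names I made up.

```python
import re
from collections import defaultdict

# Captures the queried name, the elapsed seconds and the [key:value,...] block.
FETCH = re.compile(r"fetch completed at \S+ for (\S+) in ([\d.]+): .*?\[(.*?)\]")


def parse_fetch(line):
    """Return (domain, seconds, counters) for a fetch line, else None."""
    match = FETCH.search(line)
    if not match:
        return None
    fields = dict(item.split(":", 1) for item in match.group(3).split(","))
    domain = fields.pop("domain").lower()
    counters = {key: int(value) for key, value in fields.items()}
    return domain, float(match.group(2)), counters


def summarize(lines):
    """Total fetches, queries sent and timeouts per domain."""
    totals = defaultdict(lambda: {"fetches": 0, "qrysent": 0, "timeout": 0})
    for line in lines:
        parsed = parse_fetch(line)
        if parsed is None:
            continue
        domain, _seconds, counters = parsed
        totals[domain]["fetches"] += 1
        totals[domain]["qrysent"] += counters.get("qrysent", 0)
        totals[domain]["timeout"] += counters.get("timeout", 0)
    return dict(totals)


if __name__ == "__main__":
    # /tmp/debug is the log file named in the logging channel quoted earlier.
    with open("/tmp/debug") as log:
        for domain, row in sorted(summarize(log).items()):
            print(domain, row)
```

If the timeouts turn out to be spread over every domain, including the
root, the problem is more likely between the server and the Internet
than with any one blacklist operator.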

-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742                 INTERNET: ma...@isc.org
_______________________________________________
bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users