On Mon, 15 Aug 2016, blrmaani wrote:
I inherited a DNS server running BIND 9.8.x. We had a DNS incident in which our customers complained of intermittent query timeouts. (Our customers run Cassandra/Hadoop applications and send the same queries repeatedly.) They also run nscd on their hosts, but I was told that all records have the same TTL of 3600, which means all names expire at the same time on thousands of client hosts.

I tried to reproduce the issue by sending hostname.bind queries and I see logs similar to the one below:

<time> <client-hostname> named[<pid>]: limit responses to <subnet> for hostname.bind CH TXT <hex-number>
<time> <client-hostname> named[<pid>]: stop limiting responses to <subnet> for hostname.bind CH TXT <hex-number>

I reviewed /etc/named.conf and do not see any 'rate-limit' configuration. I am confused because the BIND ARM says rate-limit is disabled by default, but the logs indicate otherwise.

The built-in view for the "CH" class has response rate-limiting (RRL) enabled by default. It's possible to override it, but doing so might not help you. Basically, your test queries differ enough from normal queries that your test methodology is probably invalid.

Do you see RRL log messages for normal queries? If not, then RRL is probably not your trouble. Other things, such as insufficient UDP buffering, too little CPU capacity, or overwhelmed iptables connection tracking, can also cause timeouts.
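
One rough way to check this is to count the RRL "limit responses" messages per query class and see whether any of them are for class IN rather than CH. The sketch below assumes the log lines look like the two quoted above (with a real subnet, query name, class, and type filled in). The function name and the regular expression are illustrative only and are not part of BIND.

```python
import re
from collections import Counter

# Assumed message shape, taken from the log lines quoted in this thread:
#   "... named[<pid>]: limit responses to <subnet> for <qname> <class> <type> <hex>"
#   "... named[<pid>]: stop limiting responses to <subnet> for <qname> <class> <type> <hex>"
RRL_LINE = re.compile(
    r"(?P<action>stop limiting|limit) responses to (?P<subnet>\S+)"
    r" for (?P<qname>\S+) (?P<qclass>\S+) (?P<qtype>\S+)"
)


def count_rrl_by_class(lines):
    """Count 'limit responses' events per query class (CH, IN, ...).

    'stop limiting' messages are ignored so that each rate-limiting
    episode is counted once.
    """
    counts = Counter()
    for line in lines:
        match = RRL_LINE.search(line)
        if match and match.group("action") == "limit":
            counts[match.group("qclass")] += 1
    return counts
```

Called on an open log file, for example count_rrl_by_class(open(path)) with path pointing at wherever named logs on your system, a result that contains only CH entries would suggest that RRL is not touching normal IN-class lookups.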

________________________________________________________________________
Jay Ford, Network Engineering Group, Information Technology Services
University of Iowa, Iowa City, IA 52242
email: jay-f...@uiowa.edu, phone: 319-335-5555