On 8/31/22 11:56, Drew Weaver wrote:

Hello,

We have a cluster of BIND 9 resolvers behind load balancers (for historical reasons, mainly that we can’t force people to list multiple resolver IP addresses in their static configurations, and everything still has to work).

The load balancers run health checks to determine whether the hosts are responding to queries and then, based on the result of those checks, the individual hosts are rotated in and out of operation.

We noticed that some of these health checks are failing (seemingly at random) and hosts are flapping in and out of the SLB pool, but we cannot actually figure out why those queries are failing.

43/1656 queries resulted in DNS mesg recv: no answ section

Our environment is EL7 running BIND 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.9

Checking standard logging channels the only real error we see from named is this:

“named[5821]: dispatch 0x7f70e400fad0: shutting down due to TCP receive error: (seemingly random IP address) connection reset” but the source IPs that the health checks come from don’t appear anywhere in the logs.

We read through this document https://kb.isc.org/docs/monitoring-recommendations-for-bind-9 which gave us some good ideas on things to look at, but sadly nothing stands out to us as a real cause.

If anyone has any thoughts on this I would be really grateful.


Interesting that it fails on TCP, not UDP. "netstat -s" might
show something useful?
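
As a rough sketch of how one might sift that output, the snippet below keeps only the "netstat -s" lines that mention a few keywords. The keyword list and the function names are my own assumptions, not anything BIND or netstat defines, and the exact counter wording varies between systems.

```python
import subprocess

# Assumed keywords: counter names that often hint at TCP trouble.
# The exact wording of "netstat -s" counters differs between systems.
KEYWORDS = ("reset", "overflow", "dropped", "retransmit", "listen")


def interesting_counters(netstat_text, keywords=KEYWORDS):
    """Return the stripped lines of `netstat -s` output that mention a keyword."""
    hits = []
    for line in netstat_text.splitlines():
        stripped = line.strip()
        lowered = stripped.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append(stripped)
    return hits


def run_netstat():
    """Run `netstat -s` on the local host and return its text output."""
    result = subprocess.run(
        ["netstat", "-s"], capture_output=True, text=True, check=False
    )
    return result.stdout
```

Calling interesting_counters(run_netstat()) on a resolver before and after a batch of failed health checks, and comparing the two lists, may show which counters are moving.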

It might help to describe your load balancer setup: make/model,
software revision level, how you set up the health checks, how
load balancer failover is configured. Has this behavior
started recently? Have there been any load balancer configuration
changes?
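
To help separate load balancer behavior from resolver behavior, it may be worth reproducing the check by hand against each host, over both UDP and TCP. The following is a minimal sketch using only the Python standard library. It assumes the health check's "no answ section" means a response whose answer count is zero, which is a guess on my part, and all function names here are hypothetical.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query: RD set, one question, class IN.

    qtype=1 is an A record. `name` must be a non-root domain name.
    """
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def parse_header(message):
    """Return (id, rcode, ancount) from the 12-byte DNS header."""
    query_id, flags, _qd, ancount, _ns, _ar = struct.unpack(
        "!HHHHHH", message[:12]
    )
    return query_id, flags & 0x000F, ancount


def _recv_exact(sock, n):
    """Read exactly n bytes from a TCP socket or raise ConnectionError."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed before full message arrived")
        buf += chunk
    return buf


def probe_udp(server, name, timeout=2.0):
    """Send one query over UDP and return (id, rcode, ancount)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        data, _ = sock.recvfrom(4096)
    return parse_header(data)


def probe_tcp(server, name, timeout=2.0):
    """Send one query over TCP (2-byte length prefix) and return (id, rcode, ancount)."""
    query = build_query(name)
    with socket.create_connection((server, 53), timeout=timeout) as sock:
        sock.sendall(struct.pack("!H", len(query)) + query)
        length = struct.unpack("!H", _recv_exact(sock, 2))[0]
        data = _recv_exact(sock, length)
    return parse_header(data)
```

Running probe_udp and probe_tcp in a loop against each resolver's real address (bypassing the load balancer) and logging the rcode whenever ancount is 0 might show whether the empty answers are SERVFAIL, NXDOMAIN, or something else, and whether they happen on one transport only.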

Best regards,
--
Charles Polisher
-- 
Visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from 
this list

ISC funds the development of this software with paid support subscriptions. 
Contact us at https://www.isc.org/contact/ for more information.

