On 8/31/22 11:56, Drew Weaver wrote:
Hello,
We have a cluster of BIND 9 resolvers behind load balancers (for
historical reasons, mainly that we can’t force people to use multiple
resolver IP addresses in their static configurations, and everything
still has to work).
The load balancers do health checks to determine whether or not the
hosts are responding to queries and then, based on the result of those
checks, the individual hosts are rotated in and out of operation.
We noticed that some of these health checks are failing (seemingly at
random) and hosts are flapping in and out of the SLB pool, but we
cannot actually figure out why those queries are failing.
43/1656 queries resulted in DNS mesg recv: no answ section
Our environment is EL7 running BIND 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.9
Checking standard logging channels the only real error we see from
named is this:
“named[5821]: dispatch 0x7f70e400fad0: shutting down due to TCP
receive error: (seemingly random IP address) connection reset” but the
source IPs that the health checks come from don’t appear anywhere in
the logs.
We read through this document
https://kb.isc.org/docs/monitoring-recommendations-for-bind-9 which
gave us some good ideas on things to look at but sadly there doesn’t
appear to be anything sticking out at us as a real cause.
If anyone has any thoughts on this I would be really grateful.
Interesting that it fails on TCP, not UDP. "netstat -s" might
show something useful?
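One way to read "netstat -s" is to take two snapshots and look only at
the counters that moved in between. Below is a minimal Python sketch of
that idea. It assumes the Linux net-tools output layout (an unindented
section header followed by indented counter lines), and the function
names (parse_counters, changed_counters, watch) are made up for
illustration.

```python
import re
import subprocess
import time


def parse_counters(text):
    """Map (section, label) to the first integer found on each counter line."""
    counters = {}
    section = ""
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Unindented lines such as "Tcp:" start a new section.
            section = line.strip().rstrip(":")
            continue
        match = re.search(r"\d+", line)
        if match is None:
            continue
        # Replace the number with "#" so the label is stable between runs.
        label = (line[:match.start()] + "#" + line[match.end():]).strip()
        counters[(section, label)] = int(match.group())
    return counters


def changed_counters(before, after):
    """Return only the counters whose value differs between two snapshots."""
    return {
        key: after[key] - before.get(key, 0)
        for key in after
        if after[key] != before.get(key, 0)
    }


def snapshot():
    """Run 'netstat -s' once and parse its output."""
    result = subprocess.run(
        ["netstat", "-s"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return parse_counters(result.stdout)


def watch(interval_seconds):
    """Print every counter that changed during the interval."""
    first = snapshot()
    time.sleep(interval_seconds)
    delta = changed_counters(first, snapshot())
    for (section, label), change in sorted(delta.items()):
        print("{}: {} ({:+d})".format(section, label, change))
```

Calling watch(60) on a resolver while the health checks are flapping
would list what changed in that minute. Resets, listen queue overflows,
or receive buffer errors would be the counters I'd look at first.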
It might help to describe your load balancer setup: make/model,
software revision level, how you set up the health checks, and how
load balancer failover is configured. Has this behavior
started recently? Have there been any load balancer configuration
changes?
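It could also help to run a probe of your own next to the load
balancer's checks, so that each "no answ section" result can be
classified as a timeout, a truncated reply, a SERVFAIL, or a NOERROR
with no data. Here is a rough Python sketch using only the standard
library. The function names, the query ID, and the timeout are my own
choices, and I don't know what query your load balancer actually sends,
so treat the probed name as a placeholder.

```python
import socket
import struct


def build_query(name, query_id, qtype=1):
    """Build a minimal recursive query (qtype 1 = A, class IN)."""
    # Flags 0x0100 sets only the RD (recursion desired) bit.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def summarize(response):
    """Pull out the header fields that explain an empty answer section."""
    ident, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return {
        "id": ident,
        "rcode": flags & 0x000F,           # 0 = NOERROR, 2 = SERVFAIL, 3 = NXDOMAIN
        "truncated": bool(flags & 0x0200),  # TC bit
        "answers": ancount,
    }


def recv_exact(sock, count):
    """Read exactly count bytes from a TCP socket or raise ConnectionError."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("connection closed before full reply")
        data += chunk
    return data


def probe_udp(server, name, query_id=0x1234, timeout=2.0):
    """Send one query over UDP; raises socket.timeout if nothing comes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, query_id), (server, 53))
        data, _ = sock.recvfrom(4096)
    return summarize(data)


def probe_tcp(server, name, query_id=0x1234, timeout=2.0):
    """Send one query over TCP, using the two-byte length prefix."""
    query = build_query(name, query_id)
    with socket.create_connection((server, 53), timeout=timeout) as sock:
        sock.sendall(struct.pack("!H", len(query)) + query)
        (length,) = struct.unpack("!H", recv_exact(sock, 2))
        return summarize(recv_exact(sock, length))
```

Running probe_udp and probe_tcp in a loop against each resolver
directly (not through the virtual IP), and logging the result with a
timestamp, would show whether the failures line up with the TCP resets
in the named log or with something else.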
Best regards,
--
Charles Polisher