Re: Request for review of performance advice

Browne, Stuart via bind-users Tue, 07 Jul 2020 19:42:50 -0700

Just one quick one before I run off to lunch with regards to section 2:

- Try to avoid crossing NUMA boundaries. At high throughput, the context 
switching and far memory calls kill performance.
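
One way to see whether a process is allowed to run across NUMA nodes, as a rough Linux-only sketch. The helper names (parse_cpulist, read_node_cpus, nodes_spanned) are made up for illustration; the sysfs path is the usual Linux location for per-node CPU lists, and other systems will need a different approach.

```python
import glob
import os
import re

def parse_cpulist(text):
    """Parse a Linux cpulist string such as '0-3,8-11' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        first, _, last = part.partition("-")
        # A bare number like '5' has no range end, so reuse the start.
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus

def read_node_cpus(sysfs="/sys/devices/system/node"):
    """Map NUMA node id -> set of CPU ids, read from sysfs (Linux only)."""
    node_cpus = {}
    for path in glob.glob(os.path.join(sysfs, "node[0-9]*", "cpulist")):
        node_name = os.path.basename(os.path.dirname(path))
        node_id = int(re.search(r"(\d+)$", node_name).group(1))
        with open(path) as fh:
            node_cpus[node_id] = parse_cpulist(fh.read())
    return node_cpus

def nodes_spanned(affinity, node_cpus):
    """Return the sorted NUMA node ids that own at least one CPU in affinity."""
    return sorted(node for node, cpus in node_cpus.items() if cpus & affinity)
```

On Linux, nodes_spanned(os.sched_getaffinity(pid), read_node_cpus()) returning more than one node means the scheduler is free to move that process across a NUMA boundary.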

Stuart

From: bind-users <bind-users-boun...@lists.isc.org> on behalf of Victoria Risk 
<vi...@isc.org>
Date: Wednesday, 8 July 2020 at 11:58
To: bind-users <bind-users@lists.isc.org>
Subject: Request for review of performance advice

A while ago we created a KB article with tips on how to improve your 
performance with our Kea DHCP server. The tips were fairly obvious to our 
developers and this was pretty successful. We would like to do something 
similar for BIND: provide a dozen or so tips for how to maximize your 
throughput with BIND. However, as usual, everything is more complicated with 
BIND.

Can those of you who care about performance, who have worked to improve your 
performance, share some of your suggestions that have the most impact?  Please 
also comment if you think any of these ideas below are stupid or dangerous. I 
have combined advice for resolvers and for authoritative servers; I hope it is 
clear which is which...

The ideas we have fall into four general categories:

System design
1a) Use a load balancer to specialize your resolvers and maximize your cache 
hit ratio.  A load balancer is traditionally designed to spread the traffic out 
evenly among a pool of servers, but it can also be used to concentrate related 
queries on one server to make its cache as hot as possible. For example, if all 
queries for domains in .info are sent to one server in a pool, there is a 
better chance that an answer will be in the cache there.
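
To make the idea in 1a concrete, here is a minimal sketch that picks a backend by hashing the top-level domain of the query name, so every .info query lands on the same resolver. The function name pick_resolver and the pool addresses are hypothetical. A real load balancer would express this in its own configuration, and the sketch ignores health checks and uneven load between TLDs.

```python
import hashlib
from typing import Sequence

def pick_resolver(qname: str, pool: Sequence[str]) -> str:
    """Pick a backend from pool based on the TLD of qname (hypothetical helper)."""
    labels = qname.rstrip(".").lower().split(".")
    tld = labels[-1]
    # Use a stable hash so the mapping survives restarts; Python's
    # built-in hash() of a str is randomized per process.
    digest = hashlib.sha256(tld.encode("ascii", "ignore")).digest()
    return pool[int.from_bytes(digest[:4], "big") % len(pool)]

# Placeholder addresses from the RFC 5737 documentation range.
POOL = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
```

Any two names under the same TLD map to the same backend regardless of case or a trailing dot. Changing the pool size reshuffles the mapping, which empties the hot caches, so a production setup would probably want consistent hashing instead.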

1b) If you have a large authoritative system with many servers, consider 
dedicating some machines to propagate transfers. These machines, called 
transfer servers, would not answer client queries, but just send notifies and 
process IXFR requests.

1c) Deploy ghost secondaries.  If you store copies of authoritative zones on 
resolvers (resolvers as undelegated secondaries), you can avoid querying those 
authoritative zones. The most obvious uses of this would be mirroring the root 
zone locally or mirroring your own authoritative zones on your resolver.

We have other system design ideas that we suspect would help, but we are not 
sure, so I will wait to see if anyone suggests them.

OS settings and the system environment
2a) Run on bare metal if possible, not on virtual machines or in the cloud. 
(any idea how much difference this makes? the only reference we can cite is 
pretty out of date - 
https://indico.dns-oarc.net/event/19/contributions/234/attachments/217/411/DNS_perf_OARC_Apr_14.pdf)

2b) Consider building with --with-tuning=large 
(https://kb.isc.org/docs/aa-01314). 
This is a compile-time option, so not something you can switch on and off 
during production. 

2c) Consider which R/W lock choice you want to use - 
https://kb.isc.org/docs/choosing-a-read-write-lock-implementation-to-use-with-named
 For the highest tested query rates (> 100,000 queries per second), pthreads 
read-write locks with hyper-threading enabled seem to be the best-performing 
choice by far.

2d) Pay attention to your choice of NIC cards. We have found wide variations in 
their performance. (Can anyone suggest what specifically to look for?)

2e) Make sure your socket send buffers are big enough. (not sure if this is 
obsolete advice, do we need to tell people how to tell if their buffers are 
causing delays?)
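
As a starting point for 2e, this small sketch asks the OS what send buffer a new UDP socket gets and what it grants when a larger one is requested. It does not show what named itself has configured, and it cannot tell you whether the buffer is causing delays. It only shows the default and the ceiling on the host. The function name udp_send_buffer_size is made up for illustration.

```python
import socket

def udp_send_buffer_size(requested=None):
    """Return the SO_SNDBUF size in bytes of a new UDP socket.

    If requested is given, ask the OS for that size first. The OS may
    cap or adjust the value it actually grants.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        if requested is not None:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)

if __name__ == "__main__":
    print("default:", udp_send_buffer_size())
    print("after asking for 4 MiB:", udp_send_buffer_size(4 * 1024 * 1024))
```

If the second number comes back much smaller than what was asked for, the system-wide limit (net.core.wmem_max on Linux) is the thing to raise.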

2f) When the number of CPUs is very large (32 or more), the increase in UDP 
listeners may not provide any performance improvement and might actually reduce 
throughput slightly due to the overhead of the additional structures and tasks. 
We suggest trying different values of -U to find the optimal one for your 
production environment.

named Features
3a) Minimize logging. Query logging is expensive (can cost you 20% or more of 
your throughput) so don’t do it unless you are using the logs for something. 
Logging with dnstap is lower impact, but still fairly expensive. Don’t run in 
debug mode unless necessary. 

3b) Use the named.conf option minimal-responses yes; to reduce the amount of 
work that named needs to do to assemble the query response, as well as reducing 
the amount of outbound traffic.

3c) Disable synth-from-dnssec. While this seemed like a good idea, it turns 
out that in practice it does not improve performance.

3d) Tune your zone transfers. 
(https://kb.isc.org/docs/aa-00726)
When tuning the behavior of the primary, there are several factors that you can 
control:
- The rate of notifications of changes to secondary servers (serial-query-rate 
and notify-delay)
- Limits on concurrent zone transfers (transfers-out, tcp-clients, 
tcp-listen-queue, reserved-sockets)
- Efficiency/management options (max-transfer-time-out, max-transfer-idle-out, 
transfer-format)
The most important options to focus on are transfers-out, serial-query-rate, 
tcp-clients and tcp-listen-queue.

3e) If you use RPZ, consider using qname-wait-recurse. We have had issues with 
RPZ transfers impacting query performance in resolvers. In general, a larger 
number of smaller RPZ zones will transfer faster than a few very large RPZ zones. 

3f) Consider enabling prefetch on your resolver, unless you are running 9.10 
(which is EOL): 
https://kb.isc.org/docs/aa-01122

Fix your transport network. 
Transport network issues cause BIND to keep retrying, which is a performance 
drain.
4a) Disable (in some cases, completely remove in order to prevent ongoing 
interference) outbound firewalls/packet-filters (particularly those that 
maintain state on connections). These are a frequent cause of problems in the 
DNS that can cause your DNS server to do a lot of extra work. 

4b) Set an appropriate MTU for your network. Ensure that your network 
infrastructure supports EDNS and large UDP responses up to 4096 bytes. Ensure 
that your network infrastructure allows transit for and reassembly of 
fragmented UDP packets (these will be large query responses if you are DNSSEC 
signing).
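
To put numbers on 4b, here is a back-of-the-envelope sketch of when a UDP response has to be fragmented over IPv4. It assumes a 20-byte IPv4 header with no options and the 8-byte UDP header. The function names are made up for illustration, and IPv6 (40-byte header plus a fragment header) would give slightly different figures.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes

def max_unfragmented_dns_payload(mtu: int) -> int:
    """Largest DNS message that fits in a single IPv4/UDP packet."""
    return mtu - IPV4_HEADER - UDP_HEADER

def ipv4_fragment_count(dns_message_size: int, mtu: int) -> int:
    """Number of IPv4 packets needed to carry one UDP DNS message."""
    datagram = dns_message_size + UDP_HEADER  # what IP has to carry
    room = mtu - IPV4_HEADER                  # IP payload per packet
    if datagram <= room:
        return 1
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = room // 8 * 8
    return 1 + math.ceil((datagram - room) / per_fragment)
```

With a 1500-byte MTU this gives 1472 bytes as the largest DNS message that travels unfragmented, and three fragments for a full 4096-byte response. Losing any one of those fragments loses the whole response.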

4c) Ensure that your network infrastructure allows DNS over TCP.
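
A quick way to test 4c from a client's point of view is to send a query over TCP and see whether anything comes back. The sketch below builds a minimal query by hand and adds the two-byte length prefix that DNS over TCP uses. The function names are made up for illustration. It only checks that a reply with the matching ID arrives, and does not parse the answer.

```python
import socket
import struct

def build_query(qname: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query (RD set, one question, class IN)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    question = b""
    for label in qname.rstrip(".").split("."):
        if not label:
            continue  # the root name has no labels
        raw = label.encode("ascii")
        question += bytes([len(raw)]) + raw
    question += b"\x00" + struct.pack("!HH", qtype, 1)
    return header + question

def frame_for_tcp(message: bytes) -> bytes:
    """Prefix the message with its two-byte length, as DNS over TCP requires."""
    return struct.pack("!H", len(message)) + message

def answers_over_tcp(server: str, qname: str, timeout: float = 3.0) -> bool:
    """Return True if server replies over TCP port 53 with a matching ID."""
    query = build_query(qname)
    try:
        with socket.create_connection((server, 53), timeout=timeout) as conn:
            conn.sendall(frame_for_tcp(query))
            reply = conn.recv(4096)
    except OSError:
        return False
    # Bytes 0-1 are the length prefix; bytes 2-3 echo the query ID.
    return len(reply) >= 4 and reply[2:4] == query[:2]
```

Running answers_over_tcp against your own server from outside the firewall is the interesting test. A False result while UDP queries succeed suggests something in the path is blocking TCP port 53.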

4d) Check for and eliminate any incomplete IPv6 interface set-up. (What can go 
wrong here is that BIND thinks that it can use IPv6 authoritative servers, but 
actually the sends silently fail, leaving named waiting unnecessarily for 
responses.)

Any further suggestions, corrections or warnings are very welcome. 

Thank you!
Vicky

---------

Victoria Risk
Product Manager
Internet Systems Consortium
mailto:vi...@isc.org
