Matthieu Herrb wrote:
> Hi,
>
> before trying to implement it, I'd like to seek opinions on the sanity
> of the following:
>
> most resolver libs have quite long timeouts on the DNS servers they
> query, and generally start again from the 1st one in their
> configuration (typically /etc/resolv.conf) for each name resolution.
> So when the 1st name server is down, the impact on client machines is
> really noticeable and makes users complain.

Yep. DNS is highly redundant, but you might end up waiting a bit...

> So I would like to implement some kind of replication using carp to
> ensure that the ip address listed in the client configuration will
> always answer.
>
> First I'm making sure that this server is a recursive, caching only
> name server. The authoritative server is separate, and for it the
> multiple NS records (with one master and some slaves) work well.
>
> I'm using net/unbound to implement the server, but still I don't trust
> it enough to consider that as long as the interface on one machine
> running unbound is up and getting carp advertisements the name server
> is answering. So I'm considering using ifstated to monitor the
> unbound process and demote the interface if something goes wrong.
>
> Does this look sane?
>
> If someone has already implemented something similar, I'd like to hear
> about it (and maybe to see a sample ifstated.conf that implements it).
>
> Hint if someone wants to do the same: in unbound.conf you have to
> explicitly set 'interface:' to the IP of your carp group (setting
> outgoing-interface is not enough), otherwise unbound will answer from
> the IP of the carpdev interface.

I did this (CARPed DNS resolvers) with CARP and djbdns a few years ago,
and have been using it for quite a while, unfortunately only in my home
network (which has some complexity, but not massive amounts of DNS
traffic). CARP+djbdns works great and the system keeps answering DNS
queries while the machines are being upgraded, so the basic idea is
sound.

(Regarding your hint, however... the one problem I did have with djbdns
is that it only listens on one interface at a time, so monitoring that
both dnscache services were running was a bit tricky, as each was bound
to the CARP interface. "tricky" = I don't seem to have come up with a
good solution, though I don't think I tried very hard or long; I have
several more ideas, I just haven't tried them.)

As for monitoring DNS services to make sure the DNS app itself doesn't
fall over, and forcing that interface down if it does, I'm quite torn.
On the one hand, yes, it's a risk and a failure point. However, I'd not
want to be running a DNS server I felt had a reasonable likelihood of
falling over and not answering queries. So, if we grant that the DNS
resolver app crapping out is unlikely, I'd be more concerned about the
complexity of the solution, or about the app failing in some way that my
monitoring system didn't anticipate, than about the app actually failing
while the machine survives. If the app failing is considered likely,
then I'd look for a different solution.

Though yes, it looks like it is time for me to start looking at moving
to unbound...

Nick.