On 30.08.2011 00:04, Mark Andrews wrote:
> In message <4e5b6098.80...@pernau.at>, Klaus Darilion writes:
>> Hi!
>>
>> I have 9.7.0-P1 as slave configured with two masters: M1 and M2. M2 is
>> currently down.
>>
>> When M1 sends a NOTIFY to inform the slave of the new zone, bind starts
>> querying for the SOA record at M2. As M2 is down, bind sends
>> retransmissions and tries it several times. It takes up to 2 minutes
>> until bind starts asking M1 - then the transfer of course works fine.
>>
>> The question is: can I tweak bind to fail-over to other master servers
>> faster?
>
> try-tcp-refresh no;

Hi Mark!

Thanks for the hint. But I do not see how this can help us, as the slave
never used TCP. The SOA lookups are always done via UDP.

Some more debugging showed that the problem happens in the following
scenario:

1. On the slave we have set max-refresh-time to 5 minutes. (We added
   this in case the slave misses some NOTIFYs due to network problems.)

2. Thus, every 4.5 minutes the slave asks both masters for the serial.
   The lookup to M1 works fine; the lookup to M2 of course fails, as M2
   is down, and thus bind starts with retransmissions: every lookup has
   2 retransmissions every 15 seconds, then bind tries this again with
   a new "transaction".

3. If bind receives a NOTIFY while it tries to query M2, the NOTIFY is
   more or less ignored:

client 1.1.1.1#15733: received notify for zone 'xyz': TSIG 'foobar'
zone xyz/IN: notify from 1.1.1.1: refresh in progress, refresh check queued

Thus, it takes up to 2 minutes until bind gives up querying M2 and
starts querying M1 again.

Is it possible to tweak the retransmission timers and query timeouts
when bind performs SOA lookups?

Thanks
Klaus
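
PS: For reference, a rough sketch of the slave setup described above. The zone name and the M1 address are taken from the log lines; the M2 address (192.0.2.2) and the file name are placeholders for illustration only, and the TSIG key statements are left out. This is not a copy of the real configuration.

```
// Sketch of the slave zone described above (placeholder values marked).
zone "xyz" {
    type slave;
    // M1 is 1.1.1.1 (from the log); 192.0.2.2 is a placeholder for M2.
    masters { 1.1.1.1; 192.0.2.2; };
    file "slave/xyz.db";        // placeholder file name
    max-refresh-time 300;       // 5 minutes, as in step 1
    try-tcp-refresh no;         // Mark's suggestion; SOA queries were UDP-only anyway
};
```

With this setup a refresh check is already running against M2 when the NOTIFY from M1 arrives, which is when the "refresh in progress, refresh check queued" message appears.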