Re: [DNSOP] Terminology: "primary master"

Paul Vixie Fri, 24 Nov 2017 15:52:35 -0800


Joe Abley wrote:

Hi Paul,
...
I presume you mean existing, not extant. ...


yes. thanks for understanding what i mean in spite of what i said.

The principal challenges I have seen using a non-trivial transfer
graph with NOTIFY and [AI]XFR are when a pair of candidate masters for a
particular slave each receive a new revision of a particular zone at a
different time. The one that receives it first sends NOTIFY messages to
its slaves; if slaves don't use the NOTIFY source to select the master
subsequently used to send SOA queries and [AI]XFR requests to (and it is
normal not to in many/most/all widely-used implementations) there is a
chance they will talk to the master that hasn't yet been updated and
hence not retrieve the new revision of the zone. Presumably all masters
update eventually, however, so normality will eventually be restored.


first let's clarify that this can't happen between the PM and its DM's,
because in my example, the PM was HA. i do know that not all NOTIFY
responders will try-first the server who sent them the NOTIFY, and that
this can result in not learning a new serial number after a NOTIFY. the
BIND4 version of the NOTIFY code did this correctly, but i'm guessing
that RFC 1996 was not as clear on this topic as it might have been.
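
to make "try-first the server who sent the NOTIFY" concrete, here is a
minimal sketch in Python. it is not taken from BIND or any other
implementation: the function names, the query_serial callback and the
addresses are hypothetical, and plain integer comparison stands in for
proper serial number arithmetic.

```python
from typing import Callable, List, Optional


def order_masters(configured: List[str], notify_source: str) -> List[str]:
    """Return the configured masters with the NOTIFY sender tried first."""
    if notify_source not in configured:
        # Assumption: a NOTIFY from an address that is not a configured
        # master does not change the order in which masters are queried.
        return list(configured)
    return [notify_source] + [m for m in configured if m != notify_source]


def find_newer_master(
    current_serial: int,
    candidates: List[str],
    query_serial: Callable[[str], Optional[int]],
) -> Optional[str]:
    """Return the first candidate advertising a serial newer than ours.

    query_serial is a hypothetical callback standing in for an SOA query;
    it returns None when the master does not answer.
    """
    for master in candidates:
        serial = query_serial(master)
        # Simplification: plain integer comparison, no serial wraparound.
        if serial is not None and serial > current_serial:
            return master
    return None


# Two candidate masters; only the second has the new revision so far.
serials = {"192.0.2.1": 2017112400, "192.0.2.2": 2017112401}
masters = ["192.0.2.1", "192.0.2.2"]
ordered = order_masters(masters, notify_source="192.0.2.2")
chosen = find_newer_master(2017112400, ordered, serials.get)
```

a responder that only asks the first master in its configured list
would query 192.0.2.1 here and conclude that nothing had changed.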

my preferred solution to this woe is to fix the NOTIFY responders, but
as we know this is often outside our powers. my actual solutions to this
have been to either delay all NOTIFY's for some amount of time that's
long enough to get the DM's to all settle, and make sure that the
settle-time is as short as i can make it. or better still, give the
slave-operators only one address for "zone master", and anycast that one
to get redundancy.

as an aside, the computed random delay-before-notify was not specified
as a way to get a longer settling time; it was just to prevent all of
the recipients of a NOTIFY-burst from doing their AXFR's in parallel.
this was damnfoolishness related to BIND4's "fork on AXFR" logic, and i
deeply regret it. 1996 was my first RFC and it shows.
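
the two kinds of delay are easy to conflate, so here is a small sketch
in Python that keeps them apart. the function and parameter names are
hypothetical and the values are illustrative only; nothing here is
taken from RFC 1996 or from BIND.

```python
import random


def notify_delay(
    settle_seconds: float,
    max_jitter_seconds: float,
    rng: random.Random,
) -> float:
    """Seconds to wait before sending one NOTIFY.

    settle_seconds is a fixed wait meant to let all the DM's pick up the
    new revision first. max_jitter_seconds bounds a random extra wait
    whose only purpose is to spread out the transfers that follow.
    """
    if settle_seconds < 0 or max_jitter_seconds < 0:
        raise ValueError("delays must be non-negative")
    return settle_seconds + rng.uniform(0.0, max_jitter_seconds)


rng = random.Random()
# One delay per NOTIFY recipient, so the recipients do not all start
# their transfers at the same moment.
delays = [notify_delay(5.0, 30.0, rng) for _ in range(4)]
```

setting max_jitter_seconds to 0 leaves only the settle time, which is
the part that addresses the stale-master problem described above.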

The state involved in processing NOTIFY messages, especially when
there's loss in a path somewhere which triggers timeouts, back-off and
retransfer, can be painful to manage at scale. I have seen some abuse of
the specification in the form of fire-and-forget NOTIFYs to avoid the
state problem (discard NOTIFY responses received and never attempt to
retry sending, assuming that if ours didn't get through a NOTIFY from
another master probably will).

alas, many operators solve only the problem they know they're having,
without regard to the ones they don't know about, and without regard for
the problems their solutions might cause for others. "monkey DNA".

Davey Song described a scenario that played out in the Yeti project
where multiple masters each serving zones that were equivalent but
different caused problems to downstream IXFR clients. The zones were
equivalent in the sense that they only differed in RRSIGs -- each server
used a local ZSK, corresponding properly to an RR in the DNSKEY RRSet,
to sign all the non-DNSKEY RRSets -- but different in the sense that the
RRSIGs were different, so really they were different zones with the same
origin. IXFR clients got confused, which I think in retrospect was
probably to be expected. This feels like the kind of scenario that Tony
was talking about in his earlier message in this thread.


yeti is probably a degenerate non-teaching corner case. since it deals
with the root zone, we had to politicize the design -- there are three
DM's whose only shared data is the IANA source zone they start with, and
the KSK they sign ZSK's with. we had to avoid anointing any one party
as having a special relationship to root-like zone data. this is
unlikely to be a problem for anyone else or for any other zone.

see http://family.redbarn.org/~vixie/mz.pdf ...

I used an earlier version of that back in the day, as you might
remember. Pretty much everything I've been involved in since then has
used out-of-band provisioning of zone targets though, using devopsy
automation tools (ansible, chef, puppet) or provisioning databases that
were replicated independently of the DNS.

that's what a lot of the professionals do. however, i won't recommend
technology that hasn't been documented and tested for interoperability,
portability, and lack of IPR encumbrance.

One notable exception used supermasters in PowerDNS, which you could
call in-band (but still a different approach from metazones)

MZ was influenced by BIND, and only implemented for BIND, but was meant
to be name-server independent. i would like to see something like this
gain broad industry traction, so that we can all spend our time solving
things that somebody else hasn't already solved.


--
P Vixie

_______________________________________________
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop
