A review of: https://tools.ietf.org/html/draft-huston-kskroll-sentinel-04

This is not a blow-by-blow, nit-picking review; it tries to dive into 
architecture-level issues:

1. I don't think the Root Zone should be specifically called out.  This 
mechanism ought to work for any domain name.

The Intro has an example:

## store.  In particular, this response mechanism can be used to
## determine whether a certain Root Zone KSK is ready to be used as a
## trusted key within the context of a key roll by this resolver.

As an example, sure, but there seems to be some confusion that "the root is 
special" when it comes to the management of trust anchors.  The notion that 
any name is special ought to be put to rest.

Not all deployments of the DNS protocol need to have the same namespace.  I 
used to work with an operational inter-network with its own "everything".

The Root Zone is mentioned in a few places in the document.  I haven't seen 
that it needs to be "called out" for this proposal.  Whatever the final result 
is, it should work anywhere in any DNS tree.

2. Need to reserve labels according to regular expression

All labels matching <_is-ta-*> and <_not-ta-*> would have to be reserved to 
prevent a collision with a configured name.  As we don't know the future key 
tags we have to clear them all out.  For the root zone, that's a swath of TLDs. 
The same is true for all names (the root is not special!) where some validator 
has a trust anchor. (That raises an interesting point - how does a zone 
operator know whether their zone apex is considered to be a point of trust?  
This is an open link when thinking of Automated Updates of DNSSEC Trust Anchors 
as a protocol between two entities.)  There is no in-DNS-band indication by a 
zone administrator that they expect to implement timings compatible with STD 
74 (RFC 5011); a trust anchor manager needs out-of-band indications (and 
updates).

Note that we don't have a protocol definition for matching "partial" labels.  
Yes, that's a problem.
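
To make the "partial label" point concrete, here is a minimal sketch of the 
match a responder would have to do in code, since a zone-file wildcard can only 
stand in for whole labels.  The function name and the tag format (four hex 
digits, following the _is-ta-4034 example later in this review) are my 
assumptions, not something taken from the draft.

```python
import re

# Assumption: the tag is four hex digits, as in the _is-ta-4034 example
# used in this review; the draft's exact tag syntax may differ.
SENTINEL_LABEL = re.compile(r"_(is|not)-ta-([0-9a-f]{4})", re.IGNORECASE)


def classify_label(qname):
    """Return (kind, key_tag) if the left-most label is a sentinel, else None.

    kind is "is" or "not"; key_tag is the tag as an integer.
    """
    leftmost = qname.rstrip(".").split(".")[0]
    match = SENTINEL_LABEL.fullmatch(leftmost)
    if match is None:
        return None
    return (match.group(1).lower(), int(match.group(2), 16))


if __name__ == "__main__":
    for name in ("_is-ta-4034.example.com.", "_not-ta-1111.example.com.",
                 "www.example.com."):
        print(name, classify_label(name))
```

Every name the pattern accepts is a name that can no longer be configured 
normally under that apex, which is the reservation problem in a nutshell.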

3. Structure of the draft

The draft covers two things - the query/response protocol and the use of the 
results in a test.  This I find confusing:

## 3.  Sentinel Processing
##
##    This proposed test that uses the DNS resolver mechanism described in

Until I got to that line, I was expecting that this document covered only the 
query/response protocol.  I'll structure the rest of the review into 
"query/response" and then "use of that in testing."

4. The query/response protocol

I'm a little unclear, from the description, what is happening at the 
query/response level.

Let's say (for the sake of this email), I want to ask whether my resolver 
(which is a fuzzy statement) has a trust anchor for example.com, key_id=0x4034. 
I would then send a query for (_is-ta-4034{*}.example.com./IN/A), flags CD off 
and EDNS option DO, and expect a response of either:

-1. A response with a return code of SERVFAIL (value=2), which would imply that 
either the responder is saying "no" to the trust anchor status of the key or 
that there is data matching the query but it has either failed DNSSEC 
validation or the servers authoritative for the name could not (ultimately) be 
reached.

-2. A positive response with data, indicating that the responder treated this 
as a normal query and happened to have query-matching data to return.

-3. A negative response, either a name error (return code=NXDOMAIN) or an empty 
answer section, indicating the queried name exists and has other data.  In 
either case, the responder is treating the query as "normal."

-4. An indication that the key is held as trusted by the responder.  This is 
where I'm lost - what is returned?  The draft says:

##    ..., then the resolver should return a response indicating
##    that the response contains authenticated data according to section
##    5.8 of [RFC6840].  ...

Dereferencing the pointer, there's mumbo-jumbo about the AD bit value in the 
response.
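
To show why the fourth case is hard to pin down, here is a small sketch that 
sorts a response into the outcomes above using only the DNS header.  The 
outcome names are mine, and treating "data with AD set" as the trusted-key 
signal is only my reading of the quoted text, not something the draft states 
in these terms.

```python
import struct

SERVFAIL, NXDOMAIN = 2, 3
AD_BIT = 0x0020  # "authentic data" flag in the DNS header


def classify_response(wire):
    """Classify a raw DNS response by rcode, answer count and AD bit."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!6H", wire[:12])
    rcode = flags & 0x000F
    if rcode == SERVFAIL:
        return "servfail"        # case 1: "no", failed validation, or unreachable
    if rcode == NXDOMAIN:
        return "negative"        # case 3: name error
    if rcode != 0:
        return "other"           # e.g. REFUSED; not one of the four cases
    if ancount == 0:
        return "negative"        # case 3: empty answer section
    if flags & AD_BIT:
        return "data-ad"         # case 4?  (assumed reading of the draft)
    return "data-no-ad"          # case 2: ordinary data


if __name__ == "__main__":
    sample = struct.pack("!6H", 0x1234, 0x81A0, 1, 1, 0, 0)
    print(classify_response(sample))
```

Note that a validating responder that simply found signed data at the name 
would also land in "data-ad", so the header alone does not separate case 2 
from case 4.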

Perhaps my confusion begins here, in earlier text in "Sentinel Mechanism":

##    If the outcome of the DNS response validation process indicates that
##    the response is authentic, and if the left-most label of the original
##    query name matches the template "_is-ta-<tag-index>.", then the

DNSSEC validation is not performed on responses; it is performed on RRsets.  
DNSSEC does not check the headers, does not make a blanket statement over the 
sections (answer, authority, etc.), does not cover the EDNS stuff (OPT record 
in the additional), etc.  This sentinel mechanism has to have an answer, and it 
has to be signed (up to some trust anchor on the responder) for there to be 
validation.  Would this be an IP address (v4/v6 for A/AAAA as appropriate)?  
What would that address be, given this is not meant to "flow data"?  
(127.0.0.0/8 for IPv4, IPv6 only has ::1/128 for loopback.)

For the negative case, I'm more at sea.  For _not-ta-1111.example.com. to be 
able to return a set that DNSSEC validation can approve there would need to be 
a - as yet not defined - wildcard that matches prefixes of labels.  There would 
have to be fake address records for _not-ta-*.example.com. for all values that 
didn't match held trust anchors.  Again, with the validator determining the 
existence of a trust anchor and not the authority, this is hard to generate at 
the zone administration's DNSSEC signing time.

One factor not covered is the setting of the recursion desired bit, and this 
might be useful.  By clearing the RD bit (RD=0), the responder ought to be 
consulting only its local information.  For responders unaware of this 
feature, they'd only consult their local cache, which ought to never have an 
entry for the queried name (expecting that the authority neither configures the 
name, with or without a delegation, nor has a wildcard).
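
For anyone who wants to try the RD=0 idea, here is a rough sketch of the query 
I have in mind, built by hand so every bit is visible: RD clear, CD clear, and 
an OPT record with DO set.  The function names, the message ID and the 1232 
byte UDP payload size are arbitrary choices of mine, not values from the draft.

```python
import struct

RD_BIT = 0x0100  # recursion desired (header flags)
DO_BIT = 0x8000  # DNSSEC OK (flags half of the OPT record's TTL field)


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    name = name.rstrip(".")
    if not name:
        return b"\x00"
    out = b""
    for label in name.split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(qname, msg_id=0x1234, rd=False):
    """Build an A/IN query with CD clear, DO set, and RD as requested."""
    flags = RD_BIT if rd else 0  # CD (0x0010) is deliberately left clear
    header = struct.pack("!6H", msg_id, flags, 1, 0, 0, 1)
    question = encode_name(qname) + struct.pack("!2H", 1, 1)  # type A, class IN
    # OPT pseudo-RR: root owner name, TYPE 41, CLASS carries the UDP payload
    # size, TTL carries ext-rcode/version/flags, RDLENGTH 0.
    opt = b"\x00" + struct.pack("!HHIH", 41, 1232, DO_BIT, 0)
    return header + question + opt


if __name__ == "__main__":
    print(build_query("_is-ta-4034.example.com.").hex())
```

The resulting bytes can be sent over UDP to port 53 of the responder under 
test; with rd=True the same function gives the ordinary recursive form for 
comparison.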

With a cleared RD bit in the query handled by a responder that is aware of this 
mechanism, the answer would be something generated from data in the responder's 
trust anchor data store.  For a positive response to a positive query (i.e., 
yes to _is-ta-*) if something is to be sent back signed, a valid signature is 
needed.  For a negative response to a positive query, ... I don't know.

What if the query/response relies on having RD=0 in the query, hoping the 
responder does not send the query onward (recursion or forwarding), and we use 
some form of transaction security (like TSIG) for hop-by-hop security?

For unaware responders, they'd treat the query as one for an address record.  
Unless there's a matching name in the cache, they'd never return a data set.  
Quick check: BIND 9.9.5 returns a referral to the root, unbound 1.5.8 responds 
with return code of REFUSED.  Different answers from different implementations, 
hmmm, would make this a bit more difficult.

To me, the basics of the query/response mechanism aren't very clear, an example 
would certainly help.  (And not one sitting at the root!)

Consider how the flags in the response might be used - particularly the 
recursion desired bit.

5. The "test"

In the sense of "crawl before you walk", start with the description of queries 
and responses to specific targets as a network manager would do.  To accurately 
use the results, one would need to know the specifics of DNS query husbandry in 
place.  This is something only the network manager could know.

First, target at an IP address.  There may be one or more responders.  There 
could be a load balancer before a number of independent processes answering on 
port 53. There could be an anycast constellation of servers with some routing 
instability.  TSIG could be used to "secure" the exchange.

Next, any single end-consumer of DNS services will likely have a list of IP 
addresses to target.  The different IP addresses may have different network 
managers running the DNS service (i.e., some do DNSSEC validation, some don't, 
some have special trust anchors, etc.).  That would lead to uncertainty in 
trying to "convert symptoms into a diagnosis."

There are two ways to test.  One assumes that the tester is the manager, one 
with knowledge of the layout.  A significant ingredient is that the tester, in 
this case, would have a list of IP addresses to specifically ask, knowing the 
use of anycast and load balancing.  The other assumes that it is an end-user 
view, one that does not have the list of IP addresses to target, does not have 
knowledge of the layout, and is only able to have a black-box (opaque) view of 
the DNS as a service.

For the former, with a well-designed query/response mechanism, scripts can be 
written to verify that configurations are as expected.  Managing anycast 
deployments could use "service addresses" or an appropriately distributed 
monitoring source network.  Load balancers can be handled too.  (Eliding ways 
of doing that.)  And TSIG is an option (for security).
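
As a sketch of what such a script might look like from the manager's side: 
compare what each known responder address reports against what the manager 
expects it to hold.  The probe callable is hypothetical (it stands in for 
whatever query/response mechanism is finally defined), and the addresses in 
the example are documentation addresses, not real resolvers.

```python
def audit(expected, probe):
    """Return the (address, key_tag, wanted, got) tuples that disagree.

    expected: {resolver_address: {key_tag: bool}} as the manager believes
              things are configured.
    probe:    hypothetical callable (address, key_tag) -> True, False, or
              None when no usable answer came back.
    """
    mismatches = []
    for address, tags in expected.items():
        for key_tag, wanted in tags.items():
            got = probe(address, key_tag)
            if got != wanted:
                mismatches.append((address, key_tag, wanted, got))
    return mismatches


if __name__ == "__main__":
    expected = {
        "192.0.2.1": {0x4034: True},
        "192.0.2.2": {0x4034: True, 0x1111: False},
    }
    # Canned observations standing in for real queries.
    observed = {("192.0.2.1", 0x4034): True,
                ("192.0.2.2", 0x4034): False,
                ("192.0.2.2", 0x1111): False}
    for row in audit(expected, lambda addr, tag: observed.get((addr, tag))):
        print(row)
```

The point is only that the manager knows which addresses to ask and what 
answer to expect from each, which is exactly what the end-user view lacks.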

The latter, I infer, is the ultimate intent of the draft, that is, to 
estimate when a (as opposed to "the") Root Zone KSK rollover (being specific 
here to the root zone on the global public Internet) can proceed.  There's been 
experience here, leading to the notion that the results would give a coarse 
approximation in aggregate, as the way in which DNS recursion is done is quite 
complex.  Given that the goal is to measure the trust anchors configured in 
individual validation engines via testing of units that see the DNS as a black 
box service, a coarse approximation is all we can hope for - not that this is 
bad, but we have to begin to evaluate where the returns diminish.

Summary -

The document has a laudable goal.  The first suggestion is to structure it in 
layers.  Clean up the query/response section to describe the footwork of the 
effort.  I'd suggest experimenting with the idea of using RD=0, and figuring out 
what a trust anchor store can originate as a response (as the response is not 
originating from the administration authoritative for the name).

When it comes to the use of the footwork, divide this into how a network 
manager could incorporate this into a trust anchor maintenance activity and 
then into a more widespread, third-party, measurement of trust anchor 
deployment.

Other considerations, well beyond the scope of the subject line's document - 
revamp the Automated Updates of DNSSEC Trust Anchors from a validator-side 
mechanism into a two-party protocol, with the goal of increased manageability 
(monitoring, measurement) and, as something not yet raised, greater efficiency 
in the size of network responses (e.g., make MISSING an expected state and then 
manage around that).
