Hi,

I've reviewed the keepalive draft. Generally, I like the concept a lot
because a) there's certainly a use case that covers real operational issues,
b) it is space-efficient, and c) it is unobtrusive. However, I think the
document could improve on clarity in certain aspects; see below for details.
I've tagged my feedback with "NIT", "EDITORIAL", and "PROTOCOL" to indicate
the "severity" level of each comment.

* EDITORIAL: I think the Abstract could be cut down to the first sentence of
the second paragraph. Everything else should go into (or is already a copy of)
the Introduction section. Personal taste, I know, but I like short,
to-the-point abstracts.

* NIT: API is not expanded on first use

* EDITORIAL: The second paragraph on page 4 lists DNSSEC and crypto-related
RRTypes as the culprits for the prevalence of truncated responses. However, it
misses the main point: increased response sizes are the primary problem. So
changing the text to something like "The increasing size of response packets
(for example, due to the deployment of DNSSEC and crypto-related RRTypes)..."
would be better.

* EDITORIAL: Mention somewhere that re-using TCP connections to nameservers
would be even more beneficial if DNS over TLS were introduced. (Side note:
from an architectural perspective, I think a TLS-DNS spec should actually
REQUIRE a TLS-enabled DNS client to support the Keepalive option.)

* PROTOCOL: I'm missing a clear, normative definition of how to interpret
"TIMEOUT" values. The Option format says "a timeout value for the TCP
connection" (which is badly underspecified, given the various timeouts in TCP).
3.2.1, on the other hand, says it's "representative of the minimum expected time
an individual session should remain established for it to be used..." (which I
interpret as the absolute session duration from the time of establishment).
Other sections of the document make me believe it's the maximum interval
between two subsequent queries.

That needs to be fixed, because it will cause interop problems if not clearly
defined. My proposal would be that TIMEOUT is the maximum interval for which a
client can keep using the TCP connection after it received the last DNS message
from the other end (I don't consider 1/2 RTT relevant here).
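
To make the proposed reading concrete, here is a minimal Python sketch of an
idle timer that restarts whenever a DNS message arrives from the other end.
The class and method names are hypothetical, and it assumes TIMEOUT is
expressed in seconds; it illustrates the interpretation I'm proposing, not
anything the draft currently specifies.

```python
class IdleTimer:
    """Tracks whether a TCP session is still usable under the proposed reading."""

    def __init__(self, timeout_s, now):
        self.timeout_s = timeout_s  # TIMEOUT from the peer (assumed: seconds)
        self.last_rx = now          # time the last DNS message was received

    def on_message_received(self, now):
        # Any DNS message from the other end restarts the interval.
        self.last_rx = now

    def may_send(self, now):
        # Usable while the time since the last received message is within TIMEOUT.
        return now - self.last_rx <= self.timeout_s
```

Timestamps are passed in explicitly so the behaviour is easy to reason about;
a real implementation would more likely read a monotonic clock.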

The document should also clarify the relation between this application-level
keepalive mechanism and TCP-level keepalives. As I understand it, there is no
such relation, and that should be mentioned.

Also, please clarify that TIMEOUT is unsigned...
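
Why the signedness matters can be shown with a short Python sketch. It
assumes TIMEOUT is a 16-bit field in network byte order (my assumption, in
line with the 65535 value I suggest further down); the helper names are
hypothetical.

```python
import struct


def encode_timeout(value):
    """Pack TIMEOUT as an unsigned 16-bit integer in network byte order."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("TIMEOUT must fit in an unsigned 16-bit field")
    return struct.pack("!H", value)


def decode_timeout(data):
    """Unpack TIMEOUT as an unsigned 16-bit integer."""
    (value,) = struct.unpack("!H", data)
    return value


def decode_timeout_as_signed(data):
    """What a receiver gets if it wrongly treats the field as signed."""
    (value,) = struct.unpack("!h", data)
    return value
```

With the same two bytes on the wire, the unsigned reading of 0xFFFF is 65535
while the signed reading is -1, so two implementations could disagree about
the largest values unless the draft states the type explicitly.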

* PROTOCOL: Are the TIMEOUT values (whatever time interval they define, see
above) negotiated for a single session only, or do they affect *all* TCP DNS
sessions to a specific IP address? Since there is no "session" for the UDP part
of the negotiation, the client's first assumption would be that it's per IP
address. However, once TCP sessions are established, could the TIMEOUT
(re-)negotiation differ for each individual session? A short clarification
would be good. I think the TIMEOUT value should be independent for each
individual TCP session.
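
As a rough illustration of the per-session option, the Python sketch below
keys the stored TIMEOUT on the full connection 4-tuple instead of the server
address alone. The class name and the tuple layout are hypothetical, and the
addresses are documentation examples.

```python
class SessionTimeouts:
    """Stores one TIMEOUT per TCP session rather than one per server address."""

    def __init__(self):
        self._by_session = {}

    def negotiate(self, session, timeout_s):
        # session is (client_ip, client_port, server_ip, server_port)
        self._by_session[session] = timeout_s

    def lookup(self, session):
        # Returns None when nothing was negotiated on this particular session.
        return self._by_session.get(session)


table = SessionTimeouts()
first = ("192.0.2.1", 40001, "198.51.100.53", 53)
second = ("192.0.2.1", 40002, "198.51.100.53", 53)
table.negotiate(first, 30)
table.negotiate(second, 120)
```

Both sessions go to the same server address, yet each keeps its own value. A
per-IP model would have to pick one of the two.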

* EDITORIAL: 3.2.2, second paragraph: The main point is missing in the second
sentence. I suggest adding "... MAY keep the existing TCP session open, *up
to the duration indicated in the TIMEOUT value of the response.*"

* PROTOCOL: Is there any way to signal to a client that it should stop using
the session as soon as possible, because the server wants to tear it down
immediately? Since "0" is currently reserved for infinitely long sessions, that
is not an option. A value of "1" would allow the client to continue using the
session for another second, which is suboptimal (and could amount to several
tens of thousands of packets on busy servers). So I suggest reconsidering the
semantics of "0": maybe use 65535 as "infinity" and add text saying that a
value of "0" indicates "tear down immediately". That would be more logical to
me.
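
The semantics I'm suggesting could be sketched in Python as follows. This
reflects my proposal only, not the draft's current text (where "0" means an
infinitely long session); the function name and the returned labels are
hypothetical.

```python
INFINITE = 0xFFFF  # proposed "infinity" marker, replacing the current use of 0


def interpret_timeout(value):
    """Map a received TIMEOUT value to the action a client would take."""
    if not 0 <= value <= INFINITE:
        raise ValueError("TIMEOUT outside the unsigned 16-bit range")
    if value == 0:
        return "close-now"       # server wants the session torn down at once
    if value == INFINITE:
        return "no-idle-limit"   # session may stay open indefinitely
    return "idle-limit"          # session may be used for up to `value` seconds
```

With this mapping the client can be told to stop immediately, instead of
being granted at least one more second of use.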

* PROTOCOL: Is the expected behaviour (MUST) for both client and server that
they add the Option to every single request / response during a keepalive
session? Please clarify the intended behaviour.

- If Yes: what is the expected behaviour when a subsequent packet does not
include the option? Keep going until TIMEOUT, or assume that the server
suddenly doesn't want to do keepalive anymore and revert to "dumb" behaviour?
- If No: My assumption would be that after the TIMEOUT is initially
negotiated, client and server would keep counting, no matter whether messages
contain the option. Only when the TIMEOUT approaches would a single packet
"refresh" the TIMEOUT?

Personally, I tend towards "No", because sending the information in each and
every message (and updating session timers on every single packet) seems
redundant to me. Feedback appreciated :)
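
For what it's worth, the "No" behaviour as I picture it could look like the
Python sketch below: the deadline keeps running across messages that lack the
option and moves only when a message carries a fresh TIMEOUT. The names are
hypothetical and TIMEOUT is assumed to be in seconds.

```python
class SessionDeadline:
    """Deadline that is refreshed only by messages carrying the option."""

    def __init__(self, timeout_s, now):
        self.deadline = now + timeout_s

    def on_message(self, now, timeout_s=None):
        # Messages without the option leave the running deadline untouched.
        if timeout_s is not None:
            self.deadline = now + timeout_s

    def expired(self, now):
        return now >= self.deadline
```

Here a session negotiated with a TIMEOUT of 10 at t=0 still expires at t=10
after an option-less message at t=5, while a message carrying TIMEOUT=10 at
t=8 moves the deadline to t=18.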


tia,
Alex

_______________________________________________
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop
