On Mon, 3 May 2021 at 18:23, Ben Schwartz <bem...@google.com> wrote:
>
> The purpose of this two-layer escaping is to allow key-independent tokenizing
> of SvcParam values.  For example, I just wrote an implementation of the
> parser that works as follows:
>
> 1. Tokenize
>    a. Scan forward looking for whitespace or "=".  This is the key name.
>    b. If "=" was found, run the standard char-string parser.  Its output is
>       the SvcParam presentation value.
> 2. Create a SvcParam representing (key name, value).
>    a. Choose a value parser based on the key name.
>       - Parsers that use the "value-list" pattern call a subroutine with a
>         tiny chunk of escaping logic for handling backslash and comma.
>    b. Run the parser on the value.
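
For readers following the thread, the arrangement quoted above can be sketched roughly as follows. This is an illustrative Python sketch, not Ben's implementation: the function names (decode_char_string, split_value_list, parse_svcparam, alpn_wire) are hypothetical, input is assumed to be ASCII, and only the "alpn" key is treated as a value-list.

```python
DIGITS = "0123456789"

def decode_char_string(text):
    """Layer 1: RFC 1035 char-string unescaping (backslash-DDD and backslash-X)."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\":
            out.append(ord(ch))            # ordinary character (ASCII assumed)
            i += 1
            continue
        nxt = text[i + 1:i + 2]
        if nxt == "":
            raise ValueError("dangling backslash in char-string")
        if nxt in DIGITS:                  # decimal escape: exactly three digits
            ddd = text[i + 1:i + 4]
            if len(ddd) != 3 or any(c not in DIGITS for c in ddd):
                raise ValueError("malformed decimal escape")
            out.append(int(ddd))           # raises ValueError if above 255
            i += 4
        else:                              # backslash-X quotes X literally
            out.append(ord(nxt))
            i += 2
    return bytes(out)

def split_value_list(value):
    """Layer 2: split on unescaped commas; a backslash quotes the next octet."""
    items = [bytearray()]
    i = 0
    while i < len(value):
        octet = value[i]
        if octet == 0x5C:                  # backslash
            if i + 1 >= len(value):
                raise ValueError("dangling backslash in value-list")
            items[-1].append(value[i + 1])
            i += 2
        elif octet == 0x2C:                # unescaped comma: delimiter
            items.append(bytearray())
            i += 1
        else:
            items[-1].append(octet)
            i += 1
    return [bytes(item) for item in items]

def parse_svcparam(token):
    """Steps 1 and 2: key-independent tokenizing, then a key-specific parser."""
    key, sep, raw = token.partition("=")
    if not sep:
        return key, None
    if len(raw) >= 2 and raw[0] == raw[-1] == '"':
        raw = raw[1:-1]                    # strip surrounding double quotes
    value = decode_char_string(raw)
    if key == "alpn":                      # assumption: the only value-list key here
        return key, split_value_list(value)
    return key, value

def alpn_wire(items):
    """Encode each ALPN id as a length octet followed by its octets."""
    return b"".join(bytes([len(item)]) + item for item in items)

key, ids = parse_svcparam(r'alpn="f\\\\oo\\,bar,h2"')
print(key, ids, alpn_wire(ids).hex())
```

Fed the first Appendix D.2 presentation form, the sketch yields two ALPN ids, an 8-octet one and "h2". The char-string layer knows nothing about commas; that knowledge lives entirely in the second layer.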

Thanks for the explanation.

The brutal reality is that the char-string parser has already obliterated
the distinction between escaped and unescaped commas before the
value-list parser is invoked.

>
> If we use single-layer escaping, this arrangement is not possible.  Instead,
> (1) we would need to add complexity to the char-string parser that is shared
> by many RR types.
> (2) we would need to integrate key-type-specific behavior into the tokenizer,
> complicating the interface between the parsing function and the SvcParams
> implementation.
>
> In the draft's arrangement, those implementation choices are allowed, but not
> required.

Publishing test vectors that include escaped escapes effectively makes
them required, inflicting your questionable design choice on other
developers who opted for a different strategy.

>
> For the administrators, this all seems likely to remain irrelevant.  There is
> no known use case where escaping is or will be used in a value-list element.
> The functionality is defined only to preserve ALPN as 8-bit-clean, as
> requested by some TLS experts, but there will likely never be a defined ALPN
> that contains these characters.

If you truly believe that, then the pragmatic solution is to accept the
unfortunate fact that your implementation is limited to value-lists not
containing escaped commas.

The residual risk that someone, somewhere, will discover a need for an
escaped comma is likely to be small.  At worst, you will need to revisit
your design; at best, you need do nothing at all.

For the sanity of all concerned, SVCB should adhere to the same standard
RFC1035 escape conventions as the other 50+ RRTYPEs.

Regards

--Dick

>
> On Sun, May 2, 2021 at 5:27 PM Mark Andrews <ma...@isc.org> wrote:
>>
>> I agree with you Dick, but some developers complained that they "couldn’t
>> re-use their string parsers" (despite no existing parser supporting
>> key=“value”) so now we have to double escape backslashes.
>> I very much feel that this is tail wagging the dog.
>>
>> > On 3 May 2021, at 01:25, Dick Franks <rwfra...@gmail.com> wrote:
>> >
>> > All,
>> >
>> > I have considerable difficulty with these test vectors at the end of
>> > Appendix D.2:
>> >
>> >     16 foo.example.org. alpn="f\\\\oo\\,bar,h2"
>> >     16 foo.example.org. alpn=f\\\092oo\092,bar,h2
>> >
>> >     \# 35 (
>> >     00 10                                              ; priority
>> >     03 66 6f 6f 07 65 78 61 6d 70 6c 65 03 6f 72 67 00 ; target
>> >     00 01                                              ; key 1
>> >     00 0c                                              ; param length 12
>> >     08                                                 ; alpn length 8
>> >     66 5c 6f 6f 2c 62 61 72                            ; alpn value
>> >     02                                                 ; alpn length 2
>> >     68 32                                              ; alpn value
>> >     )
>> >
>> > which appear to be incompatible with RFC1035 5.1 paragraph 10:
>> >
>> >     Because these files are text files several special encodings are
>> >     necessary to allow arbitrary data to be loaded.  In particular:
>> >
>> >     ...
>> >
>> >     \X      where X is any character other than a digit (0-9), is
>> >             used to quote that character so that its special meaning
>> >             does not apply.  For example, "\." can be used to place
>> >             a dot character in a label.
>> >
>> >     \DDD    where each D is a digit is the octet corresponding to
>> >             the decimal number described by DDD.  The resulting
>> >             octet is assumed to be text and is not checked for
>> >             special meaning.
>> >
>> > The intention appears to be to include (a) a single arbitrary octet in
>> > the argument, and (b) a plain text comma not being a delimiter in the
>> > argument list.  The specimen result is consistent with that assumption.
>> >
>> > Armed with the weapons supplied by RFC1035, the obvious way to
>> > represent such an argument is:  alpn="f\092oo\,bar,h2"
>> >
>> >
>> > A parser adhering strictly to RFC1035 zone file escape conventions:
>> >
>> >     #!/usr/bin/perl
>> >     use Net::DNS 1.31;
>> >     use Net::DNS::ZoneFile;
>> >
>> >     my $zonefile = new Net::DNS::ZoneFile(\*DATA);
>> >     while ( my $rr = $zonefile->read ) {
>> >         $rr->print;
>> >     }
>> >     exit;
>> >
>> >     __DATA__
>> >     rfc1035-compliant.example. SVCB 16 foo.example.org. alpn="f\092oo\,bar,h2"
>> >
>> > produces the desired wire-format image:
>> >
>> >     rfc1035-compliant.example. IN SVCB ( \# 35 0010 ; 16
>> >         03666f6f076578616d706c65036f7267 00 ; foo.example.org.
>> >         0001 000c 08665c6f6f2c626172026832 )
>> >
>> > Other parsers are available.
>> >
>> >
>> > The test vectors, as written, appear to rely upon somehow reactivating
>> > the special meaning of the escape character which is explicitly
>> > disallowed by RFC1035.
>> >
>> > The result in each case is:
>> >
>> >     non-compliant.example. IN SVCB ( \# 37 0010 ; 16
>> >         03666f6f076578616d706c65036f7267 00 ; foo.example.org.
>> >         0001 000e 06665c5c6f6f5c03626172026832 )
>> >
>> > the escaped escape characters being inserted as uninterpreted text per
>> > RFC1035.
>> >
>> >
>> > Dick Franks
>> > ________________________
>> >
>> > _______________________________________________
>> > DNSOP mailing list
>> > DNSOP@ietf.org
>> > https://www.ietf.org/mailman/listinfo/dnsop
>>
>> --
>> Mark Andrews, ISC
>> 1 Seymour St., Dundas Valley, NSW 2117, Australia
>> PHONE: +61 2 9871 4742              INTERNET: ma...@isc.org
>>
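
The two SvcParam value images quoted in the original message (param lengths 000c and 000e) can be reproduced with a single-layer decoder that applies the RFC1035 escapes and splits on unescaped commas in one pass. The following Python sketch is illustrative only: it is not the Net::DNS code, the function names are hypothetical, and ASCII input is assumed.

```python
DIGITS = "0123456789"

def split_single_layer(text):
    """Decode RFC 1035 escapes and split on unescaped commas in one pass."""
    items = [bytearray()]
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == ",":                      # unescaped comma: delimiter
            items.append(bytearray())
            i += 1
        elif ch != "\\":                   # ordinary character (ASCII assumed)
            items[-1].append(ord(ch))
            i += 1
        else:
            nxt = text[i + 1:i + 2]
            if nxt == "":
                raise ValueError("dangling backslash")
            if nxt in DIGITS:              # decimal escape: exactly three digits
                ddd = text[i + 1:i + 4]
                if len(ddd) != 3 or any(c not in DIGITS for c in ddd):
                    raise ValueError("malformed decimal escape")
                items[-1].append(int(ddd)) # raises ValueError if above 255
                i += 4
            else:                          # backslash-X: X loses its special meaning
                items[-1].append(ord(nxt))
                i += 2
    return [bytes(item) for item in items]

def alpn_wire(items):
    """Encode each ALPN id as a length octet followed by its octets."""
    return b"".join(bytes([len(item)]) + item for item in items)

for presentation in (r'f\092oo\,bar,h2',         # form proposed in the message
                     r'f\\\\oo\\,bar,h2',        # Appendix D.2, quoted form
                     r'f\\\092oo\092,bar,h2'):   # Appendix D.2, unquoted form
    print(presentation, alpn_wire(split_single_layer(presentation)).hex())
```

Under these assumptions the first form gives 08665c6f6f2c626172026832 (12 octets), while both Appendix D.2 forms give 06665c5c6f6f5c03626172026832 (14 octets), matching the two images quoted above.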