On 2 July 2018 at 00:03, Paul Hoffman <paul.hoff...@vpnc.org> wrote:
>8
> Well, RFC 1035 *does* say that it is in ASCII, whether we like that or not.

Perhaps we need to remind ourselves what RFC 1035 actually *does* say.

ASCII is mentioned in three places only:

2.3.3 para 1, which starts out on case-insensitive character string comparisons but then wanders off-topic:

    However, future additions beyond current usage may need to use
    the full binary octet capabilities in names, so attempts to store
    domain names in 7-bit ASCII or use of special bytes to terminate
    labels, etc., should be avoided.

3.1 para 3, supposedly dealing with preferred name syntax, which also has another crack at case-insensitive comparisons:

    Name servers and resolvers must compare labels in a
    case-insensitive manner (i.e., A=a), assuming ASCII with zero
    parity. Non-alphabetic codes must match exactly.

6.1.2 para 3, in a section dealing with nameserver internal data structures, which yet again gets fixated on character case issues:

    One way to solve the case problem is to store the labels for each
    node in two pieces: a standardized-case representation of the
    label where all ASCII characters are in a single case, together
    with a bit mask that denotes which characters are actually of a
    different case.

3.1 para 1 deals with domain names in DNS messages. Since that is an external interface, one might expect the character encoding to be specified there, yet it does not mention ASCII at all.

Furthermore, in the whole of section 5, which specifies the format of master files, there is no mention of the character encoding of the files themselves.

There is only one other occurrence of "encoding" remotely relevant to the present discussion: 5.1 last para, dealing with special characters and escape sequences:

    Because these files are text files several special encodings are
    necessary to allow arbitrary data to be loaded.

There is nothing in the subsequent \DDD description to indicate that the decimal number represents an ASCII code point, which it clearly must if labels are ASCII encoded.

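To make the octet-level reading concrete, here is a minimal Python sketch of the two rules quoted above: the 3.1 comparison rule and a \DDD-style escape. It is an illustration only, not text or code from RFC 1035. The names `fold_ascii`, `labels_equal` and `unescape` are hypothetical, and `unescape` assumes that the backslash and digits in its input are themselves ASCII octets, which is exactly the assumption section 5 leaves unstated.

```python
import re


def fold_ascii(label: bytes) -> bytes:
    """Lower-case only octets 0x41-0x5A (ASCII 'A'-'Z'); leave every other octet alone."""
    return bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in label)


def labels_equal(a: bytes, b: bytes) -> bool:
    """Compare two labels case-insensitively for ASCII letters, exactly otherwise."""
    return fold_ascii(a) == fold_ascii(b)


# Either a backslash followed by three decimal digits, or a backslash
# followed by any single octet (the \X form).
_ESCAPE = re.compile(rb"\\(\d{3})|\\(.)", re.DOTALL)


def unescape(text: bytes) -> bytes:
    """Replace \\DDD with the octet whose decimal value is DDD, and \\X with X.

    Assumption (not stated in RFC 1035): the input is ASCII-encoded text.
    """
    def repl(match: "re.Match[bytes]") -> bytes:
        if match.group(1) is not None:
            value = int(match.group(1).decode("ascii"))
            if value > 255:
                raise ValueError("\\DDD escape does not fit in one octet")
            return bytes([value])
        return match.group(2)

    return _ESCAPE.sub(repl, text)
```

Under these assumptions, `labels_equal(b"Example", b"eXAMPLE")` is true, while the octets 0xC9 and 0xE9 (upper- and lower-case E-acute in Latin-1) compare as different, because only the ASCII letter range is folded.
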
The other occurrences of "encoding" are in section 8, and deal with mailbox addresses and related matters of no interest here.

The proposition that RFC1035 mandates ASCII "presentation format" is demonstrably false.

>8
> ... An application could choose to encode the presentation format using a
> different encoding, but that's outside the scope of the protocol.

Application programs do not "choose the encoding"; that "choice" is inflicted upon them by the I/O system provided by the platform on which they run. If the platform uses EBCDIC, then "presentation format" is EBCDIC encoded, constrained (in this instance) to use the same printable character repertoire as on an ASCII platform.

--Dick
_______________________________________________
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop