Hi all

I guess you're all aware of the question of what constitutes a valid domain name, which characters are valid in labels, etc. So forgive me for re-raising what must be an ancient, maybe long-since-settled issue...

but there's a serious problem out there.

RFC1034 section 3.5, which is almost copied verbatim in RFC1035 section 2.3.1 (both labelled "Preferred name syntax"), clearly define


    <domain> ::= <subdomain> | " "

    <subdomain> ::= <label> | <subdomain> "." <label>

    <label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]

    <ldh-str> ::= <let-dig-hyp> | <let-dig-hyp> <ldh-str>

    <let-dig-hyp> ::= <let-dig> | "-"

    <let-dig> ::= <letter> | <digit>

    <letter> ::= any one of the 52 alphabetic characters A through Z in upper case and a through z in lower case

    <digit> ::= any one of the ten digits 0 through 9

    Note that while upper and lower case letters are allowed in domain names, no significance is attached to the case. That is, two names with the same spelling but different case are to be treated as if identical.

    The labels must follow the rules for ARPANET host names. They must start with a letter, end with a letter or digit, and have as interior characters only letters, digits, and hyphen. There are also some restrictions on the length. Labels must be 63 characters or less.

which allows DNS labels (not just host names) to contain alphanumeric and hyphen only. There doesn't seem to be a MUST level requirement to use this, but there doesn't seem to be any specification elsewhere in the documents either.
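To make the grammar concrete, here is a minimal sketch of a checker for it. It follows the quoted productions literally, so it rejects a trailing dot and a leading digit. The names LABEL_RE, is_preferred_label and is_preferred_name are made up for this illustration and do not come from either RFC.

```python
import re

# <label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]
# One letter, then optionally up to 61 letters/digits/hyphens followed by
# one letter or digit, which caps the label at 63 characters.
LABEL_RE = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_preferred_label(label: str) -> bool:
    """True if the label matches the <label> production."""
    return LABEL_RE.fullmatch(label) is not None


def is_preferred_name(name: str) -> bool:
    """True if the name matches the <domain> production."""
    # <domain> ::= <subdomain> | " "   (a single space stands for the root)
    if name == " ":
        return True
    # <subdomain> ::= <label> | <subdomain> "." <label>
    return all(is_preferred_label(label) for label in name.split("."))


print(is_preferred_name("example.com"))            # plain LDH labels
print(is_preferred_name("_sip._tcp.example.com"))  # underscore labels
```

Under this reading, an SRV-style owner name such as _sip._tcp.example.com fails the check, because the underscore is not in the <let-dig-hyp> set.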


RFC2181 (section 11), on the other hand, says


    The DNS itself places only one restriction on the particular labels that can be used to identify resource records. That one restriction relates to the length of the label and the full name. The length of any one label is limited to between 1 and 63 octets. A full domain name is limited to 255 octets (including the separators). The zero length full name is defined as representing the root of the DNS tree, and is typically written and displayed as ".".

    Those restrictions aside, any binary string whatever can be used as the label of any resource record. Similarly, any binary string can serve as the value of any record that includes a domain name as some or all of its value (SOA, NS, MX, PTR, CNAME, and any others that may be added).
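For contrast, the length-only rule in this passage can be sketched as below. The function name fits_length_rules is hypothetical. How to count the 255 octets is my assumption: I count the wire form, with one length octet per label plus the terminating zero octet for the root.

```python
from typing import Sequence


def fits_length_rules(labels: Sequence[bytes]) -> bool:
    """Check only the label and full-name length limits from the quoted text.

    Labels are raw bytes: any octet value is accepted, only length matters.
    An empty sequence stands for the root (the zero length full name).
    """
    # Each label is limited to between 1 and 63 octets.
    if any(not 1 <= len(label) <= 63 for label in labels):
        return False
    # Assumption: the 255-octet limit is counted in wire format, i.e. one
    # length octet per label plus one zero octet terminating the name.
    wire_length = sum(len(label) + 1 for label in labels) + 1
    return wire_length <= 255


print(fits_length_rules([b"example", b"com"]))
print(fits_length_rules([b"\x00\xff_ *", b"example"]))  # arbitrary octets
print(fits_length_rules([b"a" * 64]))                    # label too long
```

Nothing here inspects the content of a label, which is exactly the gap between this text and the "preferred name syntax" grammar.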


So how did we get from alphanumeric+hyphen to "any binary"?

If we truly allow "any binary", why the need for special ASCII-compatible encodings for IDN?
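As a small illustration of what such an encoding does (not an answer to the question), Python's built-in "idna" codec, which implements the older IDNA 2003 rules, maps a non-ASCII label onto letters, digits and hyphen only. The helper names to_ace and is_ldh are made up for this sketch.

```python
import re

# Letters, digits and hyphen only: the character set of the old grammar.
LDH_RE = re.compile(r"[A-Za-z0-9-]+")


def to_ace(name: str) -> bytes:
    """Encode a Unicode name with Python's built-in IDNA 2003 codec."""
    return name.encode("idna")


def is_ldh(encoded: bytes) -> bool:
    """True if every label of the encoded name uses only LDH characters."""
    labels = encoded.decode("ascii").split(".")
    return all(LDH_RE.fullmatch(label) is not None for label in labels)


ace = to_ace("bücher.example")
print(ace)                  # b'xn--bcher-kva.example'
print(is_ldh(ace))          # True
print(ace.decode("idna"))   # bücher.example
```

So the encoded form stays inside the alphanumeric+hyphen set, even though the "any binary" text would seem to permit the raw octets.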

Later RFCs (the ones I checked) seem to corroborate RFC2181, but I'm pretty sure the last time I tried to register a domain I couldn't enter any special characters. So there's a (probably mixed) de facto standard in use anyway.

Plus there are countless pages on various answer sites about "what is a valid DNS name" which state alphanumeric+hyphen and gloss over the underscore used for SRV records.

Is this just a mess that it's been decided we can't really adequately fix?

Thanks

Adrien
_______________________________________________
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop
