Hi all
I guess you're all aware of the issue of what constitutes a valid domain
name, what characters are valid in labels, etc. So forgive me for
re-raising what must be an ancient, maybe long-thought-put-to-rest
issue, but there's a serious problem out there.
RFC1034 section 3.5, which is almost copied in RFC1035 section 2.3.1 (both
labelled "preferred name syntax"), clearly defines:
    <domain> ::= <subdomain> | " "

    <subdomain> ::= <label> | <subdomain> "." <label>

    <label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]

    <ldh-str> ::= <let-dig-hyp> | <let-dig-hyp> <ldh-str>

    <let-dig-hyp> ::= <let-dig> | "-"

    <let-dig> ::= <letter> | <digit>

    <letter> ::= any one of the 52 alphabetic characters A through Z in
    upper case and a through z in lower case

    <digit> ::= any one of the ten digits 0 through 9

    Note that while upper and lower case letters are allowed in domain
    names, no significance is attached to the case. That is, two names
    with the same spelling but different case are to be treated as if
    identical.

    The labels must follow the rules for ARPANET host names. They must
    start with a letter, end with a letter or digit, and have as interior
    characters only letters, digits, and hyphen. There are also some
    restrictions on the length. Labels must be 63 characters or less.

which allows DNS labels (not just host names) to contain only
alphanumerics and hyphens. There doesn't seem to be a MUST-level
requirement to use this syntax, but there doesn't seem to be any other
specification elsewhere in the documents either.
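
To make that grammar concrete, here is a minimal Python sketch of how I
read it. The names LABEL_RE, is_preferred_label and is_preferred_name are
made up for illustration; they are not from any RFC or library, and the
sketch covers only the rules quoted above.

```python
import re

# <label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]
# i.e. starts with a letter, ends with a letter or digit,
# and has only letters, digits and hyphens in between.
LABEL_RE = re.compile(r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def is_preferred_label(label: str) -> bool:
    """Check one label against the quoted syntax, including the 63 limit."""
    return len(label) <= 63 and LABEL_RE.fullmatch(label) is not None


def is_preferred_name(name: str) -> bool:
    """Check a dotted name: <domain> ::= <subdomain> | " "."""
    if name == " ":
        return True
    return all(is_preferred_label(label) for label in name.split("."))
```

Under this reading "example.com" passes, while "_sip._tcp.example.com"
fails on the underscore.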
RFC2181 (section 11), on the other hand, says:
    The DNS itself places only one restriction on the particular labels
    that can be used to identify resource records. That one restriction
    relates to the length of the label and the full name. The length of
    any one label is limited to between 1 and 63 octets. A full domain
    name is limited to 255 octets (including the separators). The zero
    length full name is defined as representing the root of the DNS tree,
    and is typically written and displayed as ".". Those restrictions
    aside, any binary string whatever can be used as the label of any
    resource record. Similarly, any binary string can serve as the value
    of any record that includes a domain name as some or all of its value
    (SOA, NS, MX, PTR, CNAME, and any others that may be added).

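
By contrast, a check based only on the length limits in that passage
could look roughly like the sketch below. The function name
is_valid_wire_name is hypothetical, and I am assuming the 255-octet
limit is counted the wire-format way (one length octet per label plus
one octet for the root label), which is my interpretation of "including
the separators".

```python
from typing import Sequence


def is_valid_wire_name(labels: Sequence[bytes]) -> bool:
    """Apply only the length limits; label contents are arbitrary octets.

    `labels` lists the non-root labels; an empty sequence is the root.
    """
    # Each label must be between 1 and 63 octets long.
    if any(not 1 <= len(label) <= 63 for label in labels):
        return False
    # Assumed counting: one length octet per label, plus the root label.
    total = sum(1 + len(label) for label in labels) + 1
    return total <= 255
```

With this check a label such as b"\x00\xff" is acceptable, which is
exactly the "any binary string" reading.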
So how did we get from alphanumeric+hyphen to "any binary"?
If we truly allow "any binary", why the need for special ASCII-compatible
encodings for IDN?
Later RFCs (the ones I checked) seem to corroborate RFC2181, but I'm
pretty sure the last time I tried to register a domain I couldn't enter
any special chars. So there's a (probably mixed) de facto standard in
use anyway.
Plus there are the countless pages on various answer sites about "what
is a valid DNS name", which state alphanumeric+hyphen and seem to gloss
over the underscore used for SRV records.
Is this just a mess that it's been decided we can't really adequately
fix?
Thanks
Adrien
_______________________________________________
DNSOP mailing list
DNSOP@ietf.org
https://www.ietf.org/mailman/listinfo/dnsop