Hi,

below is an unofficial I18NDIR review of the draft,
performed by John Klensin (forwarded here with his permission).
Thanks to John for doing this review.

Regards,
Valery.

-----Original Message-----
From: John C Klensin [mailto:john-i...@jck.com] 
Sent: Monday, December 26, 2022 9:53 PM
To: Valery Smyslov
Cc: uta-cha...@ietf.org;        ; art-...@ietf.org; Barry Leiba
Subject: Re: [I18ndir] I18NDIR active?

 (1) Given the importance of anyone intending to use a
certificate being absolutely certain that the certificate that
should apply is actually the certificate in hand, it seems to me
desirable that Section 4.1 more carefully examine anything
identified as "SHOULD" and comment on the circumstances in which
following that rule would be inappropriate and/or the possible
consequences of not applying it.

(2) The document makes several references to URIs, but only RFC
3986 appears to be referenced.  In the real world in which
certificates are established and used and in which differences
in specifications and practices often provide opportunities for
exploitation by would-be evildoers, there are at least two,
probably three, URI specifications (IETF/RFC 3986, WHATWG, and
maybe W3C).  Each is treated as authoritative by some Internet
actors and they are not consistent with each other.  That
situation and its implications should be pointed out, at least
as a Security Consideration.

(3) Similarly, there are, in practice, at least two different
specifications for IDNs.  While the IETF considers it obsolete,
IDNA2003 is still referenced periodically and might constitute
another.  One of those, obviously, is as specified by RFC
5890ff.  Another is specified by the Unicode Consortium as
UTS#46 [4]; it is claimed to be a transition strategy, but has
shown no signs in recent years of being used that way rather
than as an alternate spec.  These specifications, and
other local deviations and (claimed) translation strategies, are
in wide use in different communities.  In particular, while
ICANN --and hence what is nominally permitted to be registered
in the DNS near the root of the tree-- at least nominally
conforms to IDNA2008 (but has been unable to prevent some TLDs
from registering emoji as second-level domains), WHATWG (and
hence most or all browser vendors and implementers) has
specifications written in terms of UTS#46.  The same
considerations as in (2) above apply, only the incompatibilities
among the specs are much greater, with emoji in domain names
being a striking difference although there are many more subtle
cases.  And, as in (2), this issue should at least be a Security
Consideration.
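
A minimal Python sketch of the kind of divergence described in
(3).  The example name is an assumption chosen for illustration;
it is not taken from the draft or from this review.  Python's
built-in "idna" codec implements IDNA2003 (RFC 3490), which maps
U+00DF to "ss", while IDNA2008 treats the same character as
valid and leaves it in the label.  The idna2008_style_name
helper is hypothetical and performs none of the validity checks
a real IDNA2008 implementation needs.

```python
def idna2003_name(name: str) -> str:
    # Built-in codec: IDNA2003 ToASCII applied per label (RFC 3490).
    return name.encode("idna").decode("ascii")


def idna2008_style_name(name: str) -> str:
    # Hypothetical helper: Punycode-encode each non-ASCII label unchanged.
    # It skips every IDNA2008 validity and contextual-rule check.
    labels = []
    for label in name.split("."):
        if label.isascii():
            labels.append(label)
        else:
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)


name = "fa\u00DF.example"  # illustrative name, an assumption
print(idna2003_name(name))
print(idna2008_style_name(name))
```

The two functions return different ASCII names for the same
input, so a certificate check built on one convention can
disagree with a client built on the other.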

(4) Section 6.3 strongly implies that there are two types of
domain names: the traditional, all-ASCII, variety known as the
"preferred name syntax" in RFC 1034 Section 3.5, and IDNs.
It does not say that, however; instead, it points to RFC 1034 as
a whole.  RFC 1034 imposes no restrictions on what can be in the
octets that make up a label (see, e.g., Section 11 of RFC 2181).
If you mean that the labels of the domain names you are
considering must be IDNs, all-ASCII, or "preferred syntax" (the
last two are different), figure out a way to say that explicitly.

(5) AFAICT, the third paragraph of Section 6.3, describing IDN
matching, is correct.  You might want to say "before checking
the domain name or comparing it with others" but that is a nit.

(6) Under 6.4, the draft says "The iPAddress field does not
include the IP version, so IPv4 addresses are distinguish from
IPv6 addresses only by their length (4 as opposed to 16 bytes)".
I don't know if that can be fixed or if this document would be
the appropriate place to fix it, but, as long as there are
people, companies, governments, or other entities out there with
bright ideas about IPvN, N > 6, basing operations on
this field conditional on a heuristic that depends on length
does not seem like a good idea.  And, btw, the preferred term is
usually "octets" rather than "bytes".
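
A minimal Python sketch of the length heuristic quoted in (6),
assuming the iPAddress value has already been extracted from the
certificate as a raw octet string.  The function name is
hypothetical.  It shows that the address family is inferred from
nothing but the octet count, so any other length can only be
rejected.

```python
import ipaddress


def parse_ip_address_san(octets: bytes):
    """Interpret iPAddress octets using only their length."""
    if len(octets) == 4:
        return ipaddress.IPv4Address(octets)
    if len(octets) == 16:
        return ipaddress.IPv6Address(octets)
    # No version field exists, so nothing else can be interpreted.
    raise ValueError(f"unsupported iPAddress length: {len(octets)} octets")


print(parse_ip_address_san(bytes([192, 0, 2, 1])))
print(parse_ip_address_san(bytes(15) + b"\x01"))
```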

(7) Editorial nit: the first sentence of the penultimate
paragraph of 6.4 isn't one.

(8)  If all of your processing (not just comparisons) and what
you allow to store in certificates is based on A-labels, then
I'm not sure what Section 7.2 means.  If you allow unrestricted
Unicode strings, or even U-labels, in labels in certificates,
then visual confusion by users is only one of many problems you
invite.  And, even then, note, e.g., that 
    trøll ( \u0074\u0072\u00F8\u006C\u006C )
and the identical-appearing 
    \u0074\u0072\u006F\u0338\u006C\u006C
generate different A-labels and are not a "visual confusion"
problem as described in the cited portions of RFC 5890.  UTR#36
(which you reference) and  UTS#39 [5], especially Section 4
(which the document does not reference and probably should) are
better in some ways, but basically point to the problems and
possible approaches.  In particular, implementing even the
"whole script" algorithm of UTS#39 Section 4.1 (and then 5.1)
requires a fairly deep understanding of whatever scripts might
appear in the characters of any particular DNS label.   That
does not quite rise to "impossible" but is certainly well in the
"infeasible" range, even for single-script labels, when all
possible IDN labels are considered.
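
A minimal Python sketch of the point in (8) about
identical-appearing strings.  It assumes the second string is
"o" followed by U+0338 COMBINING LONG SOLIDUS OVERLAY, which NFC
does not compose into U+00F8.  The a_label_sketch helper is
hypothetical: it normalizes to NFC and Punycode-encodes, without
any IDNA validity checks.

```python
import unicodedata

PRECOMPOSED = "tr\u00F8ll"  # U+00F8 LATIN SMALL LETTER O WITH STROKE
COMBINING = "tro\u0338ll"   # assumption: "o" + U+0338 combining overlay


def a_label_sketch(label: str) -> str:
    # Normalize to NFC, then Punycode-encode with the ACE prefix.
    nfc = unicodedata.normalize("NFC", label)
    return "xn--" + nfc.encode("punycode").decode("ascii")


# NFC leaves the two strings distinct, so the encoded labels differ too.
print(unicodedata.normalize("NFC", COMBINING) == PRECOMPOSED)
print(a_label_sketch(PRECOMPOSED))
print(a_label_sketch(COMBINING))
```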

Again, the above is based on a very superficial reading of the
document.  It is not an official review, is not a substitute for
one, and is almost certainly not complete.

    john


_______________________________________________
Uta mailing list
Uta@ietf.org
https://www.ietf.org/mailman/listinfo/uta
