On 10/12/24 13:47, Theodore Ts'o wrote:
On Tue, Dec 03, 2024 at 09:39:03PM +0100, Gioele Barabucci wrote:
NFC would solve both of these "problems":
* Both U+00E9 (é) and U+0065, U+0301 are NFC-normalized to U+00E9,
* Both U+2126 (Ohm sign) and U+03A9 (Greek capital letter omega) are
  NFC-normalized to U+03A9 (omega).
What NFC alone will not solve are homograph collisions: a (U+0061 Latin
small letter a) and а (U+0430 Cyrillic small letter a) are NFC-normalized to
different codepoints.
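
The behaviour described above can be checked with a minimal sketch using
Python's standard unicodedata module. The helper name nfc() is only
illustrative and is not part of any proposal in this thread.

```python
import unicodedata


def nfc(s: str) -> str:
    """Return the NFC-normalized form of s (illustrative helper)."""
    return unicodedata.normalize("NFC", s)


# Canonically equivalent sequences collapse to the same string.
precomposed = "\u00e9"   # é as a single code point
decomposed = "e\u0301"   # e followed by combining acute accent
print(nfc(precomposed) == nfc(decomposed))

# The Ohm sign has a singleton canonical mapping to Greek capital omega.
ohm = "\u2126"
omega = "\u03a9"
print(nfc(ohm) == nfc(omega))

# Homographs are distinct characters, so NFC keeps them apart.
latin_a = "a"            # U+0061
cyrillic_a = "\u0430"    # U+0430
print(nfc(latin_a) == nfc(cyrillic_a))
```

The first two comparisons hold and the third does not, which is exactly
the gap that confusable detection has to cover separately.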
NFC also doesn't solve various invisible characters (e.g., zero-width
spaces, bidirectional control characters). For more information about
all of the various security land mines, see [1].
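
As a rough illustration of the point about invisible characters, the
sketch below shows that NFC leaves a zero-width space and a bidi override
untouched. The helper format_chars() and the sample name are hypothetical,
and filtering on the Unicode general category "Cf" is only a crude
heuristic assumed here for demonstration, not the rule set of any
standard.

```python
import unicodedata


def format_chars(s: str) -> list[str]:
    """List code points in s whose general category is Cf (format)."""
    return [f"U+{ord(c):04X}" for c in s if unicodedata.category(c) == "Cf"]


# Hypothetical user name hiding a zero-width space (U+200B) and a
# right-to-left override (U+202E).
name = "ad\u200bmin\u202e"
normalized = unicodedata.normalize("NFC", name)

# NFC does not strip or alter these characters.
print(normalized == name)
print(format_chars(normalized))
```

A real implementation would need a proper allow-list of permitted
characters instead of this single-category check.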
NFC has been mentioned in a broader discussion on PRECIS/RFC8264/RFC8265.
The IdentifierClass of RFC 8264 explicitly disallows all these "security
land mines": https://www.rfc-editor.org/rfc/rfc8264.html#section-4.2.3
The "Security considerations" section is quite extensive (5 pages long):
https://www.rfc-editor.org/rfc/rfc8264.html#section-12
In general, the PRECIS RFCs are more prescriptive than Unicode UTS #39,
so, should Unicode usernames ever happen, the PRECIS RFCs are the
reference all programs should follow.
Regards,
--
Gioele Barabucci