Hi nick (and Marc),

At 2024-12-01T18:43:28-0500, nick black wrote:
> Gioele Barabucci left as an exercise for the reader:
> > You may have misunderstood that phrase. I was not referring to the
> > fact that there are no standardized normalization forms for Unicode
> > (I explicitly mention Annex 15 in [1]), but to the fact that there
> > is no standard that specifies which of the possible normalization
> > forms should be used for account names (and other fields in passwd).
> > POSIX explicitly limits itself to a subset of ASCII, so it is not
> > going to mandate any normalization form. Are there other standards
> > (or initiatives) in this area that you know of?
> 
> I'm glad we're both on the same page for Annex 15, and indeed, POSIX does seem
> to explicitly exclude any work in this area. Assuming we're willing to
> go beyond POSIX (and again, this seems something where we'd want to
> loop in other distributions, and probably kernel developers), I'm
> honestly not sure which of the Annex 15 canonicalizations we'd want to
> use -- I'd like to hear from experts (or at least people with
> extensive experience outside of US-ASCII) as to which method is best.
> I have no dog in that hunt, save that everyone agrees on a method.
> 
> It's for this reason that I think any work in this area needs be
> encapsulated in a common library.

It sounds like you want something isomorphic, if not identical, to
Punycode.

https://en.wikipedia.org/wiki/Punycode

...for which libraries exist, as I understand it.
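
As a rough illustration only, here is a minimal sketch using the
"punycode" codec in Python's standard library.  The function names are
hypothetical, and normalizing to NFC before encoding is my assumption,
not something any standard mandates for passwd fields; it only shows
how an Annex 15 form and an ASCII-safe encoding could be combined.

```python
import unicodedata


def to_ascii_name(name: str) -> str:
    """Normalize a name (NFC is an assumed choice) and Punycode-encode it."""
    normalized = unicodedata.normalize("NFC", name)
    # Python's built-in "punycode" codec implements the RFC 3492 encoding
    # without the "xn--" prefix that IDNA adds for domain labels.
    return normalized.encode("punycode").decode("ascii")


def from_ascii_name(encoded: str) -> str:
    """Decode a Punycode string back to its Unicode form."""
    return encoded.encode("ascii").decode("punycode")


if __name__ == "__main__":
    composed = "b\u00fccher"      # "bücher" with a precomposed u-umlaut
    decomposed = "bu\u0308cher"   # "u" followed by a combining diaeresis
    print(to_ascii_name(composed))
    print(to_ascii_name(decomposed))
    print(from_ascii_name(to_ascii_name(composed)))
```

Because both spellings are normalized first, they map to the same ASCII
string, which is the property the normalization discussion is after.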

These things are ugly, which is why I suppose they haven't caught on
despite being around for decades, but I would guess that this problem
space is such that there are no non-ugly solutions apart from "just
stick to ASCII", which some people find ugly in a different way.

Apologies if I missed someone bringing up and rejecting Punycode in the
previous ~41 messages in this thread.  I rescanned, using my fallible
human eyeballs.  It would be helpful to me if lists.debian.org supported
a search feature.

Regards,
Branden
