Hi,

thank you all for your contributions to this discussion. I have now finally understood¹ that creating a UTF-8 encoded user name and seeing it show up correctly in /etc/passwd is not enough to declare UTF-8 support. Please forgive me for not replying to each of you individually in this thread. I have read everything, and if I didn't address your arguments in this message, please feel free to remind me.

https://lists.debian.org/debian-devel/2024/11/msg00491.html correctly outlines that homograph characters, such as é (U+00E9, UTF-8 0xC3 0xA9) and the lookalike é (U+0065 U+0301, UTF-8 0x65 0xCC 0x81), are more than a nuisance. At the least, adduser should reject creating étienne if étienne already exists: those are different user names but look the same, and if you type user names instead of cutting and pasting them, you're bound to hit the wrong user depending on HOW you type and which input method you use. Not good.

https://wiki.debian.org/UserAccounts and https://wiki.debian.org/UserAccountsPhilosophy are updated accordingly.

After understanding this, I must admit that what's currently left active on the adduser team (me) doesn't have the capacity to implement this properly and in time for trixie. To make things worse, the Unicode::Precis module, which should be in Debian as libunicode-precis-perl (but isn't), hasn't seen an upstream release in more than five years. Additionally, I don't see myself writing a proper checker for the RFC 8264 IdentifierClass (Section 4.2) at the moment, since I don't have the time to work out which \p{Foo} character classes match the classes given in the RFC.

I would appreciate volunteers to help here, but first I need to bring some sense into adduser's current state of affairs to make an unstable upload that can eventually migrate to testing.

What I intend to do in adduser for the next unstable upload is:

- adduser --system's user name validation will not change.
- I'll make sure that adduser <normal user account> doesn't accept UTF-8 user names, bringing it closer to systemd's notion of a valid user name.
- adduser --allow-bad-names will still allow UTF-8 user names, without doing any normalization. I will document this and make it clear that the local admin needs to make sure that they don't allow things they don't want to have.
- adduser --allow-all-names will just pass all user names verbatim to useradd.
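
To make the étienne example above concrete, here is a minimal sketch of such a collision check. It is written in Python purely for illustration (adduser itself is Perl), and the helper names nfc and collides are hypothetical, not part of adduser. It assumes that comparing NFC-normalized forms (Unicode TR 15) is an acceptable first approximation. That only catches canonically equivalent sequences, not every lookalike, and it is no substitute for a full RFC 8264 IdentifierClass check.

```python
import unicodedata
from typing import Iterable


def nfc(name: str) -> str:
    """Return the NFC (canonical composition) form of a user name."""
    return unicodedata.normalize("NFC", name)


def collides(new_name: str, existing: Iterable[str]) -> bool:
    """Hypothetical check: does new_name match an existing name after NFC?"""
    target = nfc(new_name)
    return any(nfc(name) == target for name in existing)


# The two spellings from the example above, written with explicit escapes
# so that the difference survives copy and paste.
precomposed = "\u00e9tienne"   # e-acute as one code point, UTF-8 0xC3 0xA9
decomposed = "e\u0301tienne"   # e + combining acute, UTF-8 0x65 0xCC 0x81

# Different strings and different bytes ...
assert precomposed != decomposed
assert precomposed.encode("utf-8")[:2] == b"\xc3\xa9"
assert decomposed.encode("utf-8")[:3] == b"\x65\xcc\x81"

# ... but the same name once both are normalized.
print(collides(decomposed, [precomposed]))
```

A plain string comparison against /etc/passwd would treat the two spellings as unrelated accounts, which is exactly the trap described above.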

All this will be documented in the man page, in README.Debian and/or the Wiki after the code passes the test suite again.

I'll probably deprecate --allow-bad-names in favor of something that doesn't use the word "bad" (suggestions appreciated). OTOH, adduser in the Red Hat world uses --badname to allow such names as well.

I would love to hear your opinion. Silence is agreement ;-)

Greetings
Marc

¹ RFC 8264, RFC 8265, and Unicode TR 15, linked in this thread, were educational for me

-- 
-----------------------------------------------------------------------------
Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421