On Thu, 21 Nov 2024 at 23:26:48 +0100, Iustin Pop wrote:
> As Richard also replied, full UTF-8 is tricky, and I think it's somewhat
> misplaced to focus on the username, as opposed to gecos. Aren't most
> other OSes using the "full name" as the "display name", and the username
> is mostly one part of the user/password combination, but not a display
> property most of the time?
>
> So I would suggest that maybe the better option is to standardise the
> gecos format/gecos parsing, so migrate UI tools to use that more often.
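
To make the gecos parsing idea concrete, here is a minimal sketch of how a
UI tool might extract the display name. It assumes the traditional layout in
which the full name is the first comma-separated subfield of pw_gecos. The
function name full_name_from_gecos is hypothetical, and the "&" expansion
follows the traditional finger(1) convention rather than anything that has
been standardised.

```python
def full_name_from_gecos(gecos: str, login: str = "") -> str:
    """Return the full-name subfield of a gecos string.

    Assumption: the full name is everything before the first comma,
    as in the traditional "Full Name,Room,Work Phone,Home Phone" layout.
    """
    full_name = gecos.split(",", 1)[0]
    # Traditional finger(1) convention: "&" stands for the login name
    # with its first letter capitalised. Skipped if no login is given.
    if login:
        full_name = full_name.replace("&", login[:1].upper() + login[1:])
    return full_name
```

On a Unix system the input could come from Python's pwd module, for example
full_name_from_gecos(entry.pw_gecos, entry.pw_name) for each entry returned
by pwd.getpwall().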

As a data point, in our default GNOME desktop, System Settings
(gnome-control-center) prompts for a "Full Name" first (behind the
scenes that's the full name part of the pw_gecos field), and a
"Username" second (this is the pw_name); and the default display mode
for the gdm3 login prompt is to show a list of full names from pw_gecos.

My understanding is that the full name already allows arbitrary UTF-8,
except for the characters that can't be represented in passwd(5) syntax
(colon, comma, newline) and the ampersand.

Outside the Linux/GNU/freedesktop worlds, this is fairly similar to how
macOS presents the distinction between the display name and the Unix
username (pw_name). macOS is interesting here because it's an operating
system with a lot of Unix ancestry, but has also had a lot of effort put
into making it friendly for non-technical users.

In the macOS world, it seems to be conventional and encouraged to set
the username to a lower-case ASCII string with no punctuation, similar
to the conventions in POSIX and <https://systemd.io/USER_NAMES/>.
Unfortunately I haven't been able to find a reference for what
characters macOS allows in pw_name. Perhaps a DD who has a macOS system
(or a family member with a macOS system) could help here?

I think one good idea that we should certainly adopt from
<https://systemd.io/USER_NAMES/> is its separation between "strict mode"
(the naming convention that it encourages for all uses, and enforces
when a user is created via systemd tools) and "relaxed mode" (the much
less strict naming convention that systemd requires for names created
by non-systemd tools). Because of the differences between those two
modes, systemd is quite conservative in what its own tools will emit but
a lot more liberal in what it will accept, and that seems like a good
principle here, even if the specific rules that Debian chooses end up
differing from those that systemd has chosen.

    smcv