[Reducing the list to debian-devel. I forgot to set Reply-To and
apologize for that.]

On Thu, Nov 21, 2024 at 11:26:48PM +0100, Iustin Pop wrote:
> On 2024-11-21 18:45:06, Marc Haber wrote:
> > Should Debian allow UTF-8 user names in the first place or should we
> > restrict names for regular users to some us-ascii near set as well? (I
> > think yes, we should)
> 
> You weren't clear to which part you agreed. If by "we should" you meant
> the closest option, i.e. restrict, then I agree as well.

I am sorry. My personal opinions were among the last things I added to
the article and I was not clear here. I think we should allow UTF-8 user
names as a courtesy to those people who need non-ascii user names to
write their name, since user names are frequently chosen from the real
name of the person. In addition, this will improve software quality,
since it gives us the chance to find bugs that are already present in a
lot of software.

This comes kind of late in the Trixie cycle, but as it is currently
already possible to create user names with UTF-8 characters, I do not
like the idea of tightening our restrictions in Trixie over what we have
in Bookworm just to maybe revisit our decision in Trixie+1.

> As Richard also replied, full UTF-8 is tricky,

My current code uses \p{Graph} as a least common denominator. I am not
sure whether this is wise.
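
To make concrete what a \p{Graph}-style check lets through, here is a
rough Python sketch. It is not the adduser code (which is Perl); the
names is_graph_char and is_graph_name are hypothetical, and the category
test only approximates the Unicode "graph" property (everything except
whitespace, control, surrogate and unassigned code points).

```python
import unicodedata

# General categories that the Unicode "graph" property excludes:
# Cc = control, Cs = surrogate, Cn = unassigned.
_NON_GRAPH_CATEGORIES = {"Cc", "Cs", "Cn"}


def is_graph_char(ch: str) -> bool:
    # Approximation of the "graph" class: not whitespace and not in one
    # of the excluded general categories. This is an assumption made for
    # illustration, not a port of any existing implementation.
    return not ch.isspace() and unicodedata.category(ch) not in _NON_GRAPH_CATEGORIES


def is_graph_name(name: str) -> bool:
    # A candidate name passes only if it is non-empty and every
    # character is a "graph" character.
    return bool(name) and all(is_graph_char(ch) for ch in name)


if __name__ == "__main__":
    for candidate in ["marc", "jürgen", "名前", "a b", "a\tb", "a:b", "a\u200bb"]:
        print(repr(candidate), is_graph_name(candidate))
```

Under this approximation, names with spaces or tabs are rejected, but
"a:b" and a name containing the invisible U+200B (a format character,
not whitespace) are accepted, which is the kind of case that makes the
class a rather loose denominator.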

> and I think it's somewhat
> misplaced to focus on the username, as opposed to gecos. Aren't most
> other OSes using the "full name" as the "display name", and the username
> is mostly one part of the user/password combination, but not a display
> property most of the time?

I think that we should allow full UTF-8 in the gecos¹ field, yes. People
should be allowed to have their fully correct name in there. I also
think that users of non-latin languages should have the possibility to
have a login name that resembles their name.

¹ In 2024, no one remembers what gecos means any more. Adduser and
src:shadow use "comment" for that field nowadays.

> So I would suggest that maybe the better option is to standardise the
> gecos format/gecos parsing, so migrate UI tools to use that more often.

That doesn't solve the issue I am having with adduser right now: we are
allowing things that we are not sure we should allow.

> On the other hand, as long as this is admin-controlled, it doesn't
> matter much. I could see that viewpoint, but I wonder how much latent
> breakage would be introduced that will take years to fix in all tooling
> and all packages.

Yes. Fixing breakage makes software better, and by disallowing non-latin
characters in user names we are hiding those issues away.

Greetings
Marc

-- 
-----------------------------------------------------------------------------
Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421
