Sadeep,

On Sat, Mar 30, 2024 at 10:21:30AM +0800, Sadeep Madurange wrote:
> On 2024-03-29 16:47:22, Derek Martin wrote:
> > [*] Except in extremely rare and completely esoteric cases that apply
> >     only to experts... and by now should really apply to no one.
> 
> Thank you for patiently explaining. That was very educational. This is
> what I used to do on Linux in the past, though, without knowing why.
> 
> Unfortunately, this doesn't seem to work on OpenBSD. So, perhaps this
> qualifies as one of the esoteric cases. OpenBSD doesn't seem to pay much
> attention to the LANG variable. The following is an excerpt from the
> locale man page:

I don't have an OpenBSD system to test on, so I can't investigate
further, but I wouldn't think so, because:

> "Programs in the OpenBSD base system ignore the locale except for the
> character encoding..."

Mutt is not part of the "base system" so the limitation on locale
should not apply to it, and "except for the character encoding"
absolutely SHOULD apply, since that is specifically what is at issue
in your case.

Reading the rest of the man page, I find some of the additional
details should be applicable to your case as well:

    "If the value of LC_CTYPE ends in ‘.UTF-8’, programs in the
    OpenBSD base system ignore the beginning of it, treating for
    example zh_CN.UTF-8 exactly like en_US.UTF-8. Programs from
    packages(7) may however make a difference."

Shocking antiquation aside, this certainly seems to suggest that the
underlying support Mutt needs for the locale to work correctly does
exist on the system.  And the behavior
they describe isn't really all that different from what normally
happens on modern systems with full locale support--the point of
Unicode is that every language's character sets are available
simultaneously, so from a charset perspective the actual language
doesn't matter.  The main difference here appears to be that their
base system only has English translations, so it ignores other aspects
of LANG.
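
To illustrate why the language part doesn't matter from a charset
perspective, here is a minimal sketch in Python.  The helper name
codeset_of is made up for illustration (it is not anything from Mutt or
OpenBSD), and it assumes the usual locale name shape
language[_territory][.codeset][@modifier]:

```python
def codeset_of(locale_name):
    """Return the codeset part of a locale name such as 'en_US.UTF-8'.

    Hypothetical helper for illustration only. It assumes names shaped
    like language[_territory][.codeset][@modifier].
    """
    # Drop an optional @modifier first (e.g. 'de_DE.ISO8859-15@euro').
    base = locale_name.split("@", 1)[0]
    # Names such as 'C' or 'POSIX' carry no explicit codeset.
    if "." not in base:
        return None
    # Everything after the first dot is the codeset.
    return base.split(".", 1)[1]


# Two different languages, same codeset.
print(codeset_of("zh_CN.UTF-8") == codeset_of("en_US.UTF-8"))  # prints True
```

As far as the character encoding goes, both names say the same thing;
only the part before the dot differs.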

And then this:

    "LANG   Fallback if any of the above is unset."

This matches the behavior I explained already in my previous post...
At least according to their man page, everything I described SHOULD
work.  In other words, according to their own man page, if you set
LANG and leave the other variables unset, this should be exactly
equivalent to setting LC_CTYPE for any library functions which honor
that variable, i.e. ALL of the functions which Mutt uses to control
that behavior.  
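
The fallback order can be sketched in a few lines of Python.  This is
only a model of the POSIX lookup rule (LC_ALL first, then the specific
LC_* variable, then LANG), not code from Mutt or from any libc; the
function name effective_locale is invented for the example, and the
final "C" default is an assumption, since the standard leaves that case
implementation-defined:

```python
def effective_locale(category, env):
    """Model how one locale category (e.g. 'LC_CTYPE') gets its value.

    Hypothetical helper: 'env' is a plain dict standing in for the
    process environment. Unset and empty variables are both skipped.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    # Assumption: with nothing set, fall back to the "C" locale.
    return "C"


# Setting only LANG and setting only LC_CTYPE give the same LC_CTYPE.
print(effective_locale("LC_CTYPE", {"LANG": "en_US.UTF-8"}))      # en_US.UTF-8
print(effective_locale("LC_CTYPE", {"LC_CTYPE": "en_US.UTF-8"}))  # en_US.UTF-8
```

Note that LC_ALL, if set, overrides both, which is one reason to check
that nothing else in the environment is interfering.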

Separately, though not apparently related to any concerns you have,
Mutt's translations are provided by Mutt--not the base system--so even
that should work correctly if your locale is set properly.  That said,
setting only LC_CTYPE is fine if you only care about using Unicode as
your character set and don't want the other locale-dependent behaviors
(date & time formats, money display, etc.) to change along with it.
In my experience with en_US.UTF-8, the only such difference is the
default sort order, typically observable in the output of the ls
command and controlled specifically by LC_COLLATE.

However, I would point out that your own post said that you had left
LANG unset.  Did you try setting LANG (and unsetting all the other
environment variables, and Mutt's charset variable) as I suggested?
Did you then also look at the output of the locale command to ensure
that the settings were correct, as expected based on that setting?
I'd love to see that output, to confirm or refute whether your system
is correctly honoring LANG, as its man page seems to say it should...
And once you had confirmed that the locale output was correct, did you
then try Mutt?

In either case--whether you set only LANG or only LC_CTYPE--you should
not then need to set Mutt's character set, because it should get it
from LC_CTYPE (directly or indirectly through inheritance from LANG).
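
One way to see what character set a program would be handed is to ask
the C library directly.  The following is a minimal sketch in Python
for a Unix-like system, using the standard locale module; it makes the
same kind of query an application can make, but it is not Mutt's code,
and the function name reported_codeset is invented for the example:

```python
import locale


def reported_codeset():
    """Ask the C library which codeset the environment's locale implies."""
    try:
        # "" means: take the locale from LC_ALL / LC_CTYPE / LANG.
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # The environment names a locale this system does not provide;
        # the previous locale setting stays in effect.
        pass
    # nl_langinfo(CODESET) reports the encoding of the active LC_CTYPE.
    return locale.nl_langinfo(locale.CODESET)


if __name__ == "__main__":
    print(reported_codeset())
```

Run once with only LANG set and once with only LC_CTYPE set; if the
system honors the fallback, both runs should report the same codeset.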

All of this behavior that I've described is part of the POSIX
standard, and has been for quite literally decades.  If OpenBSD can't
handle this, then perhaps that would make at least part of an argument
for why end users shouldn't use it as their desktop OS...

-- 
Derek D. Martin    http://www.pizzashack.org/   GPG Key ID: 0xDFBEAD02
-=-=-=-=-
This message is posted from an invalid address.  Replying to it will result in
undeliverable mail due to spam prevention.  Sorry for the inconvenience.
