Hi,

I'm running Cygwin 2.2.0 on an English Windows 8.1 box:

> CYGWIN_NT-6.3 UNIT-725 2.2.0(0.289/5/3) 2015-08-03 12:51 x86_64 Cygwin

Windows regional settings are set to Russian/Russia.

In the absence of any settings in bashrc/bash_profile, the `locale`
command outputs the following:

> LANG=ru_RU
> LC_CTYPE="ru_RU"
> LC_NUMERIC="ru_RU"
> LC_TIME="ru_RU"
> LC_COLLATE="ru_RU"
> LC_MONETARY="ru_RU"
> LC_MESSAGES="ru_RU"
> LC_ALL=

This is perfectly fine, except that "no charset" in the locale output
means "ISO charset", which is ISO-8859-5 for Russian/Russia and has
never been used in practice (historically, DOS used CP866, Windows used
the CP1251 ANSI codepage, and various Unices stuck to KOI8-R before the
rise of Unicode).

The above is consistent with the `locale charmap` output, which is again
ISO-8859-5.


A short C example also confirms that ISO-8859-5 is used:

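> #include <stdio.h>
> 
> #include <locale.h>
> #include <langinfo.h>
> 
> int main(void) {
>     /* Adopt the locale from the environment (LANG, LC_*). */
>     const char *locale = setlocale(LC_ALL, "");
>     /* Character set of the locale now in effect. */
>     const char *codeset = nl_langinfo(CODESET);
>     printf("locale: %s\n", locale ? locale : "(null)");
>     printf("codeset: %s\n", codeset);
> 
>     return 0;
> }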

outputs

> locale: ru_RU/ru_RU/ru_RU/ru_RU/ru_RU/C
> codeset: ISO-8859-5


Cygwin docs state that

> Starting with Cygwin 1.7.2, the default character set is determined by the 
> default Windows ANSI codepage for this language and territory.

which is not true in my case (the Windows ANSI codepage for Cyrillic is
CP1251, not ISO-8859-5!). Surprisingly, for the Belarusian (a.k.a.
Belorussian, an East Slavic language very close to Russian) "be_BY"
locale, the default charset is indeed CP1251, which is in accordance
with both the documentation and common sense.


Additionally, in the `strace locale -u` output, I see multiple
> __get_lcid_from_locale: LCID=0x0419 
lines.

"0x0419" corresponds to Russian/Russia (see
<https://msdn.microsoft.com/en-us/library/windows/desktop/dd318693%28v=vs.85%29.aspx?f=255&MSPPError=-2147217396>).

Despite that, $(locale -u) returns "en_GB", even though all regional
settings are set to Russian/Russia. I believe this is not correct
either, and needs to be fixed.


Regards,
Andrey.
