Hi Corinna,

On Wed, 2 Sep 2020 17:24:50 +0200
Corinna Vinschen wrote:
> On Sep 2 19:54, Takashi Yano via Cygwin-patches wrote:
> > Hi Corinna,
> >
> > On Wed, 2 Sep 2020 10:38:18 +0200
> > Corinna Vinschen wrote:
> > > On Sep 2 10:30, Corinna Vinschen wrote:
> > > > Ok guys, I'm not opposed to this change in terms of its result,
> > > > but I'm starting to wonder why all this locale code in fhandler_tty
> > > > is necessary at all.
> > > >
> > > > I see that get_langinfo() calls __loadlocale and performs a lot of stuff
> > > > on the charsets which looks like duplicates of the initial_setlocale()
> > > > call performed at DLL startup.
> > > >
> > > > If there's anything missing in the initial_setlocale() call which would
> > > > be required by the pseudo tty code? What exactly is it? The codepage?
> > > > And why can't we just add the info to cygheap->locale at
> > > > initial_setlocale() time so it's available at exec time without going
> > > > through all this hassle every time?
> > > >
> > > > Apart from that, all this locale/charset/lcid stuff should be
> > > > concentrated in nlsfuncs.cc ideally.
> > >
> > > get_locale_from_env() and get_langinfo() should go away. If we just
> > > need a codepage for get_ttyp ()->term_code_page, we should really find a
> > > way to do this from within internal_setlocale().
> >
> > I looked into the internal_setlocale() code, but I could not find
> > the code which handles the code page. I found the code handling
> > the code page in the __set_charset_from_locale() function in nlsfuncs.cc,
> > but it does not return the code page itself. Could you please explain
> > your idea in more detail?
>
> I had none yet :) I was just musing, without actually thinking about a
> solution. But I think this isn't very complicated. Given this is
> inside Cygwin, nothing keeps the function from having a well-defined
> side-effect, as in setting a (not yet existing) member "term_code_page"
> of cygheap->locale.
>
> Kind of like this:
>
> diff --git a/winsup/cygwin/cygheap.h b/winsup/cygwin/cygheap.h
> index 8877cc358c39..2b84f4252071 100644
> --- a/winsup/cygwin/cygheap.h
> +++ b/winsup/cygwin/cygheap.h
> @@ -341,6 +341,7 @@ struct cygheap_debug
>  struct cygheap_locale
>  {
>    mbtowc_p mbtowc;
> +  UINT term_code_page;
>  };
>
>  struct user_heap_info
> diff --git a/winsup/cygwin/nlsfuncs.cc b/winsup/cygwin/nlsfuncs.cc
> index 668d7eb9e778..752f4239d911 100644
> --- a/winsup/cygwin/nlsfuncs.cc
> +++ b/winsup/cygwin/nlsfuncs.cc
> @@ -1298,6 +1298,9 @@ __set_charset_from_locale (const char *locale, char *charset)
>                         LOCALE_IDEFAULTANSICODEPAGE | LOCALE_RETURN_NUMBER,
>                         (PWCHAR) &cp, sizeof cp))
>      cp = 0;
> +  /* Store codepage in cygheap->locale so fhandler_tty can switch the
> +     pseudo console to the correct codepage. */
> +  cygheap->locale.term_code_page = cp ?: CP_UTF8;
>    /* Translate codepage and lcid to a charset closely aligned with the default
>       charsets defined in Glibc. */
>    const char *cs;
>
> Make sense?

I have tried your code; however, it does not work as expected.
It seems that __set_charset_from_locale() is not called:
cygheap->locale.term_code_page is always 0.

As a test, I added the following lines to setup_locale() to make sure
that __set_charset_from_locale() is called:

  setlocale (LC_ALL, "");
  __set_charset_from_locale (__get_global_locale()->categories[LC_CTYPE], charset);
  get_ttyp ()->term_code_page = cygheap->locale.term_code_page;

However, term_code_page is then set to 932 when the locale is ja_JP.UTF-8.
In this case term_code_page should be CP_UTF8 (65001). The code page
retrieved in __set_charset_from_locale() is based not on "UTF-8" but
on "ja_JP".

Let me think about it for a while.

-- 
Takashi Yano <takashi.y...@nifty.ne.jp>
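
To make the ja_JP.UTF-8 observation concrete, here is a minimal,
standalone sketch of the selection rule the message asks for: an explicit
UTF-8 charset in the locale name wins, and only otherwise does the
language's default ANSI code page apply. This is not Cygwin code.
term_code_page_for() and default_ansi_codepage() are hypothetical names,
the small lookup table merely stands in for the
LOCALE_IDEFAULTANSICODEPAGE query, and charsets other than UTF-8 (eucJP
and so on) are deliberately not handled.

```cpp
#include <cctype>
#include <string>

// Windows code page identifiers used in this sketch.
constexpr unsigned CP_UTF8_ID = 65001;  // same value as CP_UTF8

// Hypothetical stand-in for the LOCALE_IDEFAULTANSICODEPAGE lookup.
// Maps the language_TERRITORY part to its default ANSI code page;
// returns 0 when the language is unknown to this illustration table.
static unsigned default_ansi_codepage(const std::string &lang)
{
  if (lang == "ja_JP") return 932;
  if (lang == "en_US") return 1252;
  return 0;
}

// Hypothetical helper: choose a terminal code page for a POSIX locale
// name of the form "lang_TERRITORY[.charset][@modifier]".
unsigned term_code_page_for(const std::string &locale)
{
  const std::string::size_type dot = locale.find('.');

  // Language part: everything before '.', with any "@modifier" removed.
  std::string lang = locale.substr(0, dot);
  std::string::size_type at = lang.find('@');
  if (at != std::string::npos)
    lang.erase(at);

  // Charset part: everything after '.', lowercased, "@modifier" removed.
  std::string charset;
  if (dot != std::string::npos)
    {
      charset = locale.substr(dot + 1);
      at = charset.find('@');
      if (at != std::string::npos)
        charset.erase(at);
      for (char &c : charset)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }

  // An explicit UTF-8 charset overrides the language default.
  if (charset == "utf-8" || charset == "utf8")
    return CP_UTF8_ID;

  // Otherwise use the language default, falling back to UTF-8 when the
  // lookup yields 0 (the same fallback as `cp ?: CP_UTF8` in the patch).
  const unsigned cp = default_ansi_codepage(lang);
  return cp ? cp : CP_UTF8_ID;
}
```

Under these assumptions, "ja_JP.UTF-8" maps to 65001 while plain "ja_JP"
still maps to 932, which is the distinction the language-only lookup
loses.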