Re: [PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"

2020-09-02 Thread Corinna Vinschen
On Sep  1 18:19, Johannes Schindelin wrote:
> When `LANG=en_US.UTF-8`, the detected `LCID` is 0x0409, which is
> correct, but after that (at least if Pseudo Console support is enabled),
> we try to find the default code page for that `LCID`, which is ASCII
> (437). Subsequently, we set the Console output code page to that value,
> completely ignoring that we wanted to use UTF-8.
> 
> Let's not ignore the specifically asked-for UTF-8 character set.
> 
> While at it, let's also set the Console output code page even if Pseudo
> Console support is disabled; contrary to the behavior of v3.0.7, the
> Console output code page is not ignored in that case.
> 
> The most common symptom would be that console applications which do not
> specifically call `SetConsoleOutputCP()` but output UTF-8-encoded text
> seem to be broken with v3.1.x when they worked plenty fine with v3.0.x.
> 
> This fixes https://github.com/msys2/MSYS2-packages/issues/1974,
> https://github.com/msys2/MSYS2-packages/issues/2012,
> https://github.com/rust-lang/cargo/issues/8369,
> https://github.com/git-for-windows/git/issues/2734,
> https://github.com/git-for-windows/git/issues/2793,
> https://github.com/git-for-windows/git/issues/2792, and possibly quite a
> few others.
> 
> Signed-off-by: Johannes Schindelin 
> ---
>  winsup/cygwin/fhandler_tty.cc | 9 +
>  1 file changed, 9 insertions(+)

Ok guys, I'm not opposed to this change in terms of its result,
but I'm starting to wonder why all this locale code in fhandler_tty
is necessary at all.

I see that get_langinfo() calls __loadlocale and performs a lot of stuff
on the charsets which looks like duplicates of the initial_setlocale()
call performed at DLL startup.

If there's anything missing in the initial_setlocale() call which would
be required by the pseudo tty code?  What exactly is it?  The codepage?
And why can't we just add the info to cygheap->locale at initial_setlocale()
time so it's available at exec time without going through all this hassle
every time?

Apart from that, all this locale/charset/lcid stuff should be concentrated
in nlsfunc.cc ideally.


Thanks,
Corinna


Re: [PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"

2020-09-02 Thread Corinna Vinschen
On Sep  2 10:30, Corinna Vinschen wrote:
> On Sep  1 18:19, Johannes Schindelin wrote:
> > When `LANG=en_US.UTF-8`, the detected `LCID` is 0x0409, which is
> > correct, but after that (at least if Pseudo Console support is enabled),
> > we try to find the default code page for that `LCID`, which is ASCII
> > (437). Subsequently, we set the Console output code page to that value,
> > completely ignoring that we wanted to use UTF-8.
> > 
> > Let's not ignore the specifically asked-for UTF-8 character set.
> > 
> > While at it, let's also set the Console output code page even if Pseudo
> > Console support is disabled; contrary to the behavior of v3.0.7, the
> > Console output code page is not ignored in that case.
> > 
> > The most common symptom would be that console applications which do not
> > specifically call `SetConsoleOutputCP()` but output UTF-8-encoded text
> > seem to be broken with v3.1.x when they worked plenty fine with v3.0.x.
> > 
> > This fixes https://github.com/msys2/MSYS2-packages/issues/1974,
> > https://github.com/msys2/MSYS2-packages/issues/2012,
> > https://github.com/rust-lang/cargo/issues/8369,
> > https://github.com/git-for-windows/git/issues/2734,
> > https://github.com/git-for-windows/git/issues/2793,
> > https://github.com/git-for-windows/git/issues/2792, and possibly quite a
> > few others.
> > 
> > Signed-off-by: Johannes Schindelin 
> > ---
> >  winsup/cygwin/fhandler_tty.cc | 9 +
> >  1 file changed, 9 insertions(+)
> 
> Ok guys, I'm not opposed to this change in terms of its result,
> but I'm starting to wonder why all this locale code in fhandler_tty
> is necessary at all.
> 
> I see that get_langinfo() calls __loadlocale and performs a lot of stuff
> on the charsets which looks like duplicates of the initial_setlocale()
> call performed at DLL startup.
> 
> If there's anything missing in the initial_setlocale() call which would
> be required by the pseudo tty code?  What exactly is it?  The codepage?
> And why can't we just add the info to cygheap->locale at initial_setlocale()
> time so it's available at exec time without going through all this hassle
> every time?
> 
> Apart from that, all this locale/charset/lcid stuff should be concentrated
> in nlsfunc.cc ideally.

get_locale_from_env() and get_langinfo() should go away.  If we just
need a codepage for get_ttyp ()->term_code_page, we should really find a
way to do this from within internal_setlocale().


Corinna


Re: [PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"

2020-09-02 Thread Takashi Yano via Cygwin-patches
On Wed, 2 Sep 2020 10:30:14 +0200
Corinna Vinschen wrote:
> On Sep  1 18:19, Johannes Schindelin wrote:
> > When `LANG=en_US.UTF-8`, the detected `LCID` is 0x0409, which is
> > correct, but after that (at least if Pseudo Console support is enabled),
> > we try to find the default code page for that `LCID`, which is ASCII
> > (437). Subsequently, we set the Console output code page to that value,
> > completely ignoring that we wanted to use UTF-8.
> > 
> > Let's not ignore the specifically asked-for UTF-8 character set.
> > 
> > While at it, let's also set the Console output code page even if Pseudo
> > Console support is disabled; contrary to the behavior of v3.0.7, the
> > Console output code page is not ignored in that case.
> > 
> > The most common symptom would be that console applications which do not
> > specifically call `SetConsoleOutputCP()` but output UTF-8-encoded text
> > seem to be broken with v3.1.x when they worked plenty fine with v3.0.x.
> > 
> > This fixes https://github.com/msys2/MSYS2-packages/issues/1974,
> > https://github.com/msys2/MSYS2-packages/issues/2012,
> > https://github.com/rust-lang/cargo/issues/8369,
> > https://github.com/git-for-windows/git/issues/2734,
> > https://github.com/git-for-windows/git/issues/2793,
> > https://github.com/git-for-windows/git/issues/2792, and possibly quite a
> > few others.
> > 
> > Signed-off-by: Johannes Schindelin 
> > ---
> >  winsup/cygwin/fhandler_tty.cc | 9 +
> >  1 file changed, 9 insertions(+)
> 
> Ok guys, I'm not opposed to this change in terms of its result,

I am sorry, but I cannot agree with Johannes's patch.

For example, code page in Japan is CP932 by default.
In this case, cmd.exe, netsh.exe and so on are generate
messages in Japanese.

If the code page is set to CP_UTF8, messages from such
commands changes to english. I guess similar things happen
for other locales.

I do not prefer this result.

Furthermore, I looked into the issue:
https://github.com/git-for-windows/git/issues/2734
and I found that git-for-windows always use utf-8
encoding even if the locale is ja_JP.CP932.
It does not change coding based on locale or code
page.

Even with Johannes's patch, if mintty is started with
locale ja_JP.CP932, the file name will be garbled
bacause SetConsoleOutputCP(CP_UTF8) will not be called.

IMHO, it is the problem of git-for-windows rather
than cygwin and msys2.

To make current version of git-for-windows work, it is
necessary to set code page to CP_UTF8 regardless locale.
This does not make sense at all.

-- 
Takashi Yano 


Re: [PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"

2020-09-02 Thread Takashi Yano via Cygwin-patches
Hi Corinna,

On Wed, 2 Sep 2020 10:38:18 +0200
Corinna Vinschen wrote:
> On Sep  2 10:30, Corinna Vinschen wrote:
> > Ok guys, I'm not opposed to this change in terms of its result,
> > but I'm starting to wonder why all this locale code in fhandler_tty
> > is necessary at all.
> > 
> > I see that get_langinfo() calls __loadlocale and performs a lot of stuff
> > on the charsets which looks like duplicates of the initial_setlocale()
> > call performed at DLL startup.
> > 
> > If there's anything missing in the initial_setlocale() call which would
> > be required by the pseudo tty code?  What exactly is it?  The codepage?
> > And why can't we just add the info to cygheap->locale at initial_setlocale()
> > time so it's available at exec time without going through all this hassle
> > every time?
> > 
> > Apart from that, all this locale/charset/lcid stuff should be concentrated
> > in nlsfunc.cc ideally.
> 
> get_locale_from_env() and get_langinfo() should go away.  If we just
> need a codepage for get_ttyp ()->term_code_page, we should really find a
> way to do this from within internal_setlocale().

I looked into internal_setlocale() code, but I could not found
the code which handles thecode page. I found the code handling
the code page in __set_charset_from_locale() function in nlsfuncs.cc,
but it does not return code page itself. Could you please explain
more detail of your idea?

-- 
Takashi Yano 


Re: [PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"

2020-09-02 Thread Johannes Schindelin
Hi,

On Tue, 1 Sep 2020, Johannes Schindelin wrote:

> When `LANG=en_US.UTF-8`, the detected `LCID` is 0x0409, which is
> correct, but after that (at least if Pseudo Console support is enabled),
> we try to find the default code page for that `LCID`, which is ASCII
> (437). Subsequently, we set the Console output code page to that value,
> completely ignoring that we wanted to use UTF-8.
>
> Let's not ignore the specifically asked-for UTF-8 character set.
>
> While at it, let's also set the Console output code page even if Pseudo
> Console support is disabled; contrary to the behavior of v3.0.7, the
> Console output code page is not ignored in that case.
>
> The most common symptom would be that console applications which do not
> specifically call `SetConsoleOutputCP()` but output UTF-8-encoded text
> seem to be broken with v3.1.x when they worked plenty fine with v3.0.x.
>
> This fixes https://github.com/msys2/MSYS2-packages/issues/1974,
> https://github.com/msys2/MSYS2-packages/issues/2012,
> https://github.com/rust-lang/cargo/issues/8369,
> https://github.com/git-for-windows/git/issues/2734,
> https://github.com/git-for-windows/git/issues/2793,
> https://github.com/git-for-windows/git/issues/2792, and possibly quite a
> few others.
>
> Signed-off-by: Johannes Schindelin 
> ---
>  winsup/cygwin/fhandler_tty.cc | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/winsup/cygwin/fhandler_tty.cc b/winsup/cygwin/fhandler_tty.cc
> index 06789a500..414c26992 100644
> --- a/winsup/cygwin/fhandler_tty.cc
> +++ b/winsup/cygwin/fhandler_tty.cc
> @@ -2859,6 +2859,15 @@ fhandler_pty_slave::setup_locale (void)
>char charset[ENCODING_LEN + 1] = "ASCII";
>LCID lcid = get_langinfo (locale, charset);
>
> +  /* Special-case the UTF-8 character set */
> +  if (strcasecmp (charset, "UTF-8") == 0)
> +{
> +  get_ttyp ()->term_code_page = CP_UTF8;
> +  SetConsoleCP (CP_UTF8);
> +  SetConsoleOutputCP (CP_UTF8);
> +  return;
> +}
> +

Just a word of warning: while this patch can be ported to a634adda5
(libm/machine/arm: Rename s*_fma.c -> s*_fma_arm.c, 2020-09-01) relatively
easily (and the first two patches of this patch series cannot, as they are
no longer applicable after the complete redesign of the Pseudo Console
support), it only works as intended in the `disable_pcon` mode.

The new design calls for Pseudo Consoles to be created per spawned console
application.

And I have not found any way to convince my local version of the runtime
to change the code page of these Pseudo Consoles away from the rather
unfortunate default 437.

This is a problem.

Take for example https://github.com/git-for-windows/git/issues/2793.
Telling the users that they should patch node.js and recompile is probably
not going to fly.

Hopefully there is a way to fix this, otherwise Pseudo Console support
will continue to be quite the support burden.

Ciao,
Johannes

>/* Set console code page from locale */
>if (get_pseudo_console ())
>  {
> --
> 2.27.0
>
>


Re: [PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"

2020-09-02 Thread Johannes Schindelin
Hi Takashi,

On Wed, 2 Sep 2020, Takashi Yano via Cygwin-patches wrote:

> On Wed, 2 Sep 2020 10:30:14 +0200
> Corinna Vinschen wrote:
> > On Sep  1 18:19, Johannes Schindelin wrote:
> > > When `LANG=en_US.UTF-8`, the detected `LCID` is 0x0409, which is
> > > correct, but after that (at least if Pseudo Console support is enabled),
> > > we try to find the default code page for that `LCID`, which is ASCII
> > > (437). Subsequently, we set the Console output code page to that value,
> > > completely ignoring that we wanted to use UTF-8.
> > >
> > > Let's not ignore the specifically asked-for UTF-8 character set.
> > >
> > > While at it, let's also set the Console output code page even if Pseudo
> > > Console support is disabled; contrary to the behavior of v3.0.7, the
> > > Console output code page is not ignored in that case.
> > >
> > > The most common symptom would be that console applications which do not
> > > specifically call `SetConsoleOutputCP()` but output UTF-8-encoded text
> > > seem to be broken with v3.1.x when they worked plenty fine with v3.0.x.
> > >
> > > This fixes https://github.com/msys2/MSYS2-packages/issues/1974,
> > > https://github.com/msys2/MSYS2-packages/issues/2012,
> > > https://github.com/rust-lang/cargo/issues/8369,
> > > https://github.com/git-for-windows/git/issues/2734,
> > > https://github.com/git-for-windows/git/issues/2793,
> > > https://github.com/git-for-windows/git/issues/2792, and possibly quite a
> > > few others.
> > >
> > > Signed-off-by: Johannes Schindelin 
> > > ---
> > >  winsup/cygwin/fhandler_tty.cc | 9 +
> > >  1 file changed, 9 insertions(+)
> >
> > Ok guys, I'm not opposed to this change in terms of its result,
>
> I am sorry, but I cannot agree with Johannes's patch.
>
> For example, code page in Japan is CP932 by default.
> In this case, cmd.exe, netsh.exe and so on are generate
> messages in Japanese.
>
> If the code page is set to CP_UTF8, messages from such
> commands changes to english. I guess similar things happen
> for other locales.
>
> I do not prefer this result.
>
> Furthermore, I looked into the issue:
> https://github.com/git-for-windows/git/issues/2734
> and I found that git-for-windows always use utf-8
> encoding even if the locale is ja_JP.CP932.
> It does not change coding based on locale or code
> page.
>
> Even with Johannes's patch, if mintty is started with
> locale ja_JP.CP932, the file name will be garbled
> bacause SetConsoleOutputCP(CP_UTF8) will not be called.
>
> IMHO, it is the problem of git-for-windows rather
> than cygwin and msys2.
>
> To make current version of git-for-windows work, it is
> necessary to set code page to CP_UTF8 regardless locale.
> This does not make sense at all.

You are misrepresenting the problem. It has nothing to do with Git for
Windows. For example, if you run tests in an Angular project inside
Cygwin's MinTTY, the output will be garbled because node.js (or the
Angular libraries, I don't know) will interpret `LANG` or something.

This is a much bigger problem than you make it sound. The console
applications that do _not_ call `SetConsoleOutputCP()` are so
ubiquituous. I think you are underestimating that problem rather
dramatically.

Ciao,
Johannes


Re: [PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"

2020-09-02 Thread Takashi Yano via Cygwin-patches
On Wed, 2 Sep 2020 08:26:04 +0200 (CEST)
Johannes Schindelin wrote:
> Hi Takashi,
> 
> On Wed, 2 Sep 2020, Takashi Yano via Cygwin-patches wrote:
> 
> > On Wed, 2 Sep 2020 10:30:14 +0200
> > Corinna Vinschen wrote:
> > > On Sep  1 18:19, Johannes Schindelin wrote:
> > > > When `LANG=en_US.UTF-8`, the detected `LCID` is 0x0409, which is
> > > > correct, but after that (at least if Pseudo Console support is enabled),
> > > > we try to find the default code page for that `LCID`, which is ASCII
> > > > (437). Subsequently, we set the Console output code page to that value,
> > > > completely ignoring that we wanted to use UTF-8.
> > > >
> > > > Let's not ignore the specifically asked-for UTF-8 character set.
> > > >
> > > > While at it, let's also set the Console output code page even if Pseudo
> > > > Console support is disabled; contrary to the behavior of v3.0.7, the
> > > > Console output code page is not ignored in that case.
> > > >
> > > > The most common symptom would be that console applications which do not
> > > > specifically call `SetConsoleOutputCP()` but output UTF-8-encoded text
> > > > seem to be broken with v3.1.x when they worked plenty fine with v3.0.x.
> > > >
> > > > This fixes https://github.com/msys2/MSYS2-packages/issues/1974,
> > > > https://github.com/msys2/MSYS2-packages/issues/2012,
> > > > https://github.com/rust-lang/cargo/issues/8369,
> > > > https://github.com/git-for-windows/git/issues/2734,
> > > > https://github.com/git-for-windows/git/issues/2793,
> > > > https://github.com/git-for-windows/git/issues/2792, and possibly quite a
> > > > few others.
> > > >
> > > > Signed-off-by: Johannes Schindelin 
> > > > ---
> > > >  winsup/cygwin/fhandler_tty.cc | 9 +
> > > >  1 file changed, 9 insertions(+)
> > >
> > > Ok guys, I'm not opposed to this change in terms of its result,
> >
> > I am sorry, but I cannot agree with Johannes's patch.
> >
> > For example, code page in Japan is CP932 by default.
> > In this case, cmd.exe, netsh.exe and so on are generate
> > messages in Japanese.
> >
> > If the code page is set to CP_UTF8, messages from such
> > commands changes to english. I guess similar things happen
> > for other locales.
> >
> > I do not prefer this result.
> >
> > Furthermore, I looked into the issue:
> > https://github.com/git-for-windows/git/issues/2734
> > and I found that git-for-windows always use utf-8
> > encoding even if the locale is ja_JP.CP932.
> > It does not change coding based on locale or code
> > page.
> >
> > Even with Johannes's patch, if mintty is started with
> > locale ja_JP.CP932, the file name will be garbled
> > bacause SetConsoleOutputCP(CP_UTF8) will not be called.
> >
> > IMHO, it is the problem of git-for-windows rather
> > than cygwin and msys2.
> >
> > To make current version of git-for-windows work, it is
> > necessary to set code page to CP_UTF8 regardless locale.
> > This does not make sense at all.
> 
> You are misrepresenting the problem. It has nothing to do with Git for
> Windows. For example, if you run tests in an Angular project inside
> Cygwin's MinTTY, the output will be garbled because node.js (or the
> Angular libraries, I don't know) will interpret `LANG` or something.

You listed these issues in git-for-windows:
> > > > https://github.com/git-for-windows/git/issues/2734,
> > > > https://github.com/git-for-windows/git/issues/2792,
didn't you? Therefore, I looked into them.

OK, I will check Angular/CLI next. But I am not familier with
Agnular/CLI. Could you please provide simple steps to reproduce
the problem?

> This is a much bigger problem than you make it sound. The console
> applications that do _not_ call `SetConsoleOutputCP()` are so
> ubiquituous. I think you are underestimating that problem rather
> dramatically.
> 
> Ciao,
> Johannes


-- 
Takashi Yano 


Re: [PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"

2020-09-02 Thread Johannes Schindelin
Hi Takashi,


On Wed, 2 Sep 2020, Takashi Yano wrote:

> On Wed, 2 Sep 2020 08:26:04 +0200 (CEST)
> Johannes Schindelin wrote:
> > Hi Takashi,
> >
> > On Wed, 2 Sep 2020, Takashi Yano via Cygwin-patches wrote:
> >
> > > On Wed, 2 Sep 2020 10:30:14 +0200
> > > Corinna Vinschen wrote:
> > > > On Sep  1 18:19, Johannes Schindelin wrote:
> > > > > When `LANG=en_US.UTF-8`, the detected `LCID` is 0x0409, which is
> > > > > correct, but after that (at least if Pseudo Console support is 
> > > > > enabled),
> > > > > we try to find the default code page for that `LCID`, which is ASCII
> > > > > (437). Subsequently, we set the Console output code page to that 
> > > > > value,
> > > > > completely ignoring that we wanted to use UTF-8.
> > > > >
> > > > > Let's not ignore the specifically asked-for UTF-8 character set.
> > > > >
> > > > > While at it, let's also set the Console output code page even if 
> > > > > Pseudo
> > > > > Console support is disabled; contrary to the behavior of v3.0.7, the
> > > > > Console output code page is not ignored in that case.
> > > > >
> > > > > The most common symptom would be that console applications which do 
> > > > > not
> > > > > specifically call `SetConsoleOutputCP()` but output UTF-8-encoded text
> > > > > seem to be broken with v3.1.x when they worked plenty fine with 
> > > > > v3.0.x.
> > > > >
> > > > > This fixes https://github.com/msys2/MSYS2-packages/issues/1974,
> > > > > https://github.com/msys2/MSYS2-packages/issues/2012,
> > > > > https://github.com/rust-lang/cargo/issues/8369,
> > > > > https://github.com/git-for-windows/git/issues/2734,
> > > > > https://github.com/git-for-windows/git/issues/2793,
> > > > > https://github.com/git-for-windows/git/issues/2792, and possibly 
> > > > > quite a
> > > > > few others.
> > > > >
> > > > > Signed-off-by: Johannes Schindelin 
> > > > > ---
> > > > >  winsup/cygwin/fhandler_tty.cc | 9 +
> > > > >  1 file changed, 9 insertions(+)
> > > >
> > > > Ok guys, I'm not opposed to this change in terms of its result,
> > >
> > > I am sorry, but I cannot agree with Johannes's patch.
> > >
> > > For example, code page in Japan is CP932 by default.
> > > In this case, cmd.exe, netsh.exe and so on are generate
> > > messages in Japanese.
> > >
> > > If the code page is set to CP_UTF8, messages from such
> > > commands changes to english. I guess similar things happen
> > > for other locales.
> > >
> > > I do not prefer this result.
> > >
> > > Furthermore, I looked into the issue:
> > > https://github.com/git-for-windows/git/issues/2734
> > > and I found that git-for-windows always use utf-8
> > > encoding even if the locale is ja_JP.CP932.
> > > It does not change coding based on locale or code
> > > page.
> > >
> > > Even with Johannes's patch, if mintty is started with
> > > locale ja_JP.CP932, the file name will be garbled
> > > bacause SetConsoleOutputCP(CP_UTF8) will not be called.
> > >
> > > IMHO, it is the problem of git-for-windows rather
> > > than cygwin and msys2.
> > >
> > > To make current version of git-for-windows work, it is
> > > necessary to set code page to CP_UTF8 regardless locale.
> > > This does not make sense at all.
> >
> > You are misrepresenting the problem. It has nothing to do with Git for
> > Windows. For example, if you run tests in an Angular project inside
> > Cygwin's MinTTY, the output will be garbled because node.js (or the
> > Angular libraries, I don't know) will interpret `LANG` or something.
>
> You listed these issues in git-for-windows:
> > > > > https://github.com/git-for-windows/git/issues/2734,
> > > > > https://github.com/git-for-windows/git/issues/2792,
> didn't you? Therefore, I looked into them.
>
> OK, I will check Angular/CLI next. But I am not familier with
> Agnular/CLI. Could you please provide simple steps to reproduce
> the problem?

Here is a report: https://github.com/git-for-windows/git/issues/2793

But why do you want to look into Angular/CLI? Do you really think that it
makes sense to try to fix every console app out there? Really? Wouldn't it
make more sense to just accept that there are many console applications
that are broken by the recent change, and accommodate them in the Cygwin
runtime? That would take substantially less time.

Ciao,
Johannes

>
> > This is a much bigger problem than you make it sound. The console
> > applications that do _not_ call `SetConsoleOutputCP()` are so
> > ubiquituous. I think you are underestimating that problem rather
> > dramatically.
> >
> > Ciao,
> > Johannes
>
>
> --
> Takashi Yano 
>


Re: [PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"

2020-09-02 Thread Takashi Yano via Cygwin-patches
On Wed, 2 Sep 2020 11:12:53 +0200 (CEST)
Johannes Schindelin wrote:
> On Wed, 2 Sep 2020, Takashi Yano wrote:
> > OK, I will check Angular/CLI next. But I am not familier with
> > Agnular/CLI. Could you please provide simple steps to reproduce
> > the problem?
> 
> Here is a report: https://github.com/git-for-windows/git/issues/2793

I already read that thread, and I have try following step.

1) Install node.js
2) npm install --global @angular/cli
3) ng new test-app
4) cd test-app
5) ng test --code-coverage

However, the output is very differnt from the bug report,
and there seems to be no garbled output.

The output is something like:

Compiling @angular/core : es2015 as esm2015
Compiling @angular/animations : es2015 as esm2015
Compiling @angular/compiler/testing : es2015 as esm2015
Compiling @angular/common : es2015 as esm2015
Compiling @angular/animations/browser : es2015 as esm2015
Compiling @angular/core/testing : es2015 as esm2015
Compiling @angular/animations/browser/testing : es2015 as esm2015
Compiling @angular/platform-browser : es2015 as esm2015
Compiling @angular/common/http : es2015 as esm2015
Compiling @angular/common/testing : es2015 as esm2015
Compiling @angular/router : es2015 as esm2015
Compiling @angular/forms : es2015 as esm2015
Compiling @angular/platform-browser-dynamic : es2015 as esm2015
Compiling @angular/platform-browser/testing : es2015 as esm2015
Compiling @angular/platform-browser/animations : es2015 as esm2015
Compiling @angular/common/http/testing : es2015 as esm2015
Compiling @angular/platform-browser-dynamic/testing : es2015 as esm2015
Compiling @angular/router/testing : es2015 as esm2015
10% building 2/2 modules 0 active02 09 2020 23:49:25.110:WARN [karma]: No captur
ed browser, open http://localhost:9876/
02 09 2020 23:49:25.118:INFO [karma-server]: Karma v5.0.9 server started at http
://0.0.0.0:9876/
02 09 2020 23:49:25.119:INFO [launcher]: Launching browsers Chrome with concurre
ncy unlimited
02 09 2020 23:49:25.125:INFO [launcher]: Starting browser Chrome
92% additional asset processing copy-webpack-plugin02 09 2020 23:49:34.003:WARN
[karma]: No captured browser, open http://localhost:9876/
02 09 2020 23:49:34.907:INFO [Chrome 85.0.4183.83 (Windows 10)]: Connected on so
cket z7khH23JfW4v-_Tt with id 59608191
Chrome 85.0.4183.83 (Windows 10): Executed 3 of 3 SUCCESS (0.62 secs / 0.273 sec
s)
TOTAL: 3 SUCCESS
TOTAL: 3 SUCCESS
TOTAL: 3 SUCCESS

=== Coverage summary ===
Statements   : 100% ( 6/6 )
Branches : 100% ( 0/0 )
Functions: 100% ( 0/0 )
Lines: 100% ( 5/5 )


What's wrong?

> But why do you want to look into Angular/CLI? Do you really think that it
> makes sense to try to fix every console app out there? Really? Wouldn't it
> make more sense to just accept that there are many console applications
> that are broken by the recent change, and accommodate them in the Cygwin
> runtime? That would take substantially less time.

It is important to know what's happning actually to fix the issue.
Superficial patch makes the problem more complicated.

-- 
Takashi Yano 


Re: [PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"

2020-09-02 Thread Corinna Vinschen
On Sep  2 19:54, Takashi Yano via Cygwin-patches wrote:
> Hi Corinna,
> 
> On Wed, 2 Sep 2020 10:38:18 +0200
> Corinna Vinschen wrote:
> > On Sep  2 10:30, Corinna Vinschen wrote:
> > > Ok guys, I'm not opposed to this change in terms of its result,
> > > but I'm starting to wonder why all this locale code in fhandler_tty
> > > is necessary at all.
> > > 
> > > I see that get_langinfo() calls __loadlocale and performs a lot of stuff
> > > on the charsets which looks like duplicates of the initial_setlocale()
> > > call performed at DLL startup.
> > > 
> > > If there's anything missing in the initial_setlocale() call which would
> > > be required by the pseudo tty code?  What exactly is it?  The codepage?
> > > And why can't we just add the info to cygheap->locale at 
> > > initial_setlocale()
> > > time so it's available at exec time without going through all this hassle
> > > every time?
> > > 
> > > Apart from that, all this locale/charset/lcid stuff should be concentrated
> > > in nlsfunc.cc ideally.
> > 
> > get_locale_from_env() and get_langinfo() should go away.  If we just
> > need a codepage for get_ttyp ()->term_code_page, we should really find a
> > way to do this from within internal_setlocale().
> 
> I looked into internal_setlocale() code, but I could not found
> the code which handles thecode page. I found the code handling
> the code page in __set_charset_from_locale() function in nlsfuncs.cc,
> but it does not return code page itself. Could you please explain
> more detail of your idea?

I had none yet :)  I was just musing, without actually thinking about a
solution.  But I think this isn't very complicated.  Given this is
inside Cygwin, nothing keeps the function to have a well-defined
side-effect, as in setting a (not yet existing) member "term_code_page"
of cygheap->locale.

Kind of like this:

diff --git a/winsup/cygwin/cygheap.h b/winsup/cygwin/cygheap.h
index 8877cc358c39..2b84f4252071 100644
--- a/winsup/cygwin/cygheap.h
+++ b/winsup/cygwin/cygheap.h
@@ -341,6 +341,7 @@ struct cygheap_debug
 struct cygheap_locale
 {
   mbtowc_p mbtowc;
+  UINT term_code_page;
 };
 
 struct user_heap_info
diff --git a/winsup/cygwin/nlsfuncs.cc b/winsup/cygwin/nlsfuncs.cc
index 668d7eb9e778..752f4239d911 100644
--- a/winsup/cygwin/nlsfuncs.cc
+++ b/winsup/cygwin/nlsfuncs.cc
@@ -1298,6 +1298,9 @@ __set_charset_from_locale (const char *locale, char 
*charset)
LOCALE_IDEFAULTANSICODEPAGE | LOCALE_RETURN_NUMBER,
(PWCHAR) &cp, sizeof cp))
 cp = 0;
+  /* Store codepage in cygheap->locale so fhandler_tty can switch the
+ pseudo console to the correct codepage. */
+  cygheap->locale.term_code_page = cp ?: CP_UTF8;
   /* Translate codepage and lcid to a charset closely aligned with the default
  charsets defined in Glibc. */
   const char *cs;

Make sense?


Thanks,
Corinna


Re: [PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"

2020-09-02 Thread Corinna Vinschen
On Sep  2 17:24, Corinna Vinschen wrote:
> On Sep  2 19:54, Takashi Yano via Cygwin-patches wrote:
> > Hi Corinna,
> > 
> > On Wed, 2 Sep 2020 10:38:18 +0200
> > Corinna Vinschen wrote:
> > > On Sep  2 10:30, Corinna Vinschen wrote:
> > > > Ok guys, I'm not opposed to this change in terms of its result,
> > > > but I'm starting to wonder why all this locale code in fhandler_tty
> > > > is necessary at all.
> > > > 
> > > > I see that get_langinfo() calls __loadlocale and performs a lot of stuff
> > > > on the charsets which looks like duplicates of the initial_setlocale()
> > > > call performed at DLL startup.
> > > > 
> > > > If there's anything missing in the initial_setlocale() call which would
> > > > be required by the pseudo tty code?  What exactly is it?  The codepage?
> > > > And why can't we just add the info to cygheap->locale at 
> > > > initial_setlocale()
> > > > time so it's available at exec time without going through all this 
> > > > hassle
> > > > every time?
> > > > 
> > > > Apart from that, all this locale/charset/lcid stuff should be 
> > > > concentrated
> > > > in nlsfunc.cc ideally.
> > > 
> > > get_locale_from_env() and get_langinfo() should go away.  If we just
> > > need a codepage for get_ttyp ()->term_code_page, we should really find a
> > > way to do this from within internal_setlocale().
> > 
> > I looked into internal_setlocale() code, but I could not found
> > the code which handles thecode page. I found the code handling
> > the code page in __set_charset_from_locale() function in nlsfuncs.cc,
> > but it does not return code page itself. Could you please explain
> > more detail of your idea?
> 
> I had none yet :)  I was just musing, without actually thinking about a
> solution.  But I think this isn't very complicated.  Given this is
> inside Cygwin, nothing keeps the function to have a well-defined
> side-effect, as in setting a (not yet existing) member "term_code_page"
> of cygheap->locale.
> 
> Kind of like this:

Actually, this is a bit too simple, but you get the idea.  We need to
align the terminal codepage with the actual Cygwin charset, along the
lines of what your setup_locale is doing standalone yet.  Except in case
of ASCII, where we default to UTF-8 internally.  The important part here
is that we do this once, and that we don't have unnecessary code
duplication.


Corinna


Re: [PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"

2020-09-02 Thread Takashi Yano via Cygwin-patches
Hi Corinna,

On Wed, 2 Sep 2020 17:24:50 +0200
Corinna Vinschen  wrote:
> On Sep  2 19:54, Takashi Yano via Cygwin-patches wrote:
> > Hi Corinna,
> > 
> > On Wed, 2 Sep 2020 10:38:18 +0200
> > Corinna Vinschen wrote:
> > > On Sep  2 10:30, Corinna Vinschen wrote:
> > > > Ok guys, I'm not opposed to this change in terms of its result,
> > > > but I'm starting to wonder why all this locale code in fhandler_tty
> > > > is necessary at all.
> > > > 
> > > > I see that get_langinfo() calls __loadlocale and performs a lot of stuff
> > > > on the charsets which looks like duplicates of the initial_setlocale()
> > > > call performed at DLL startup.
> > > > 
> > > > If there's anything missing in the initial_setlocale() call which would
> > > > be required by the pseudo tty code?  What exactly is it?  The codepage?
> > > > And why can't we just add the info to cygheap->locale at 
> > > > initial_setlocale()
> > > > time so it's available at exec time without going through all this 
> > > > hassle
> > > > every time?
> > > > 
> > > > Apart from that, all this locale/charset/lcid stuff should be 
> > > > concentrated
> > > > in nlsfunc.cc ideally.
> > > 
> > > get_locale_from_env() and get_langinfo() should go away.  If we just
> > > need a codepage for get_ttyp ()->term_code_page, we should really find a
> > > way to do this from within internal_setlocale().
> > 
> > I looked into internal_setlocale() code, but I could not found
> > the code which handles thecode page. I found the code handling
> > the code page in __set_charset_from_locale() function in nlsfuncs.cc,
> > but it does not return code page itself. Could you please explain
> > more detail of your idea?
> 
> I had none yet :)  I was just musing, without actually thinking about a
> solution.  But I think this isn't very complicated.  Given this is
> inside Cygwin, nothing keeps the function to have a well-defined
> side-effect, as in setting a (not yet existing) member "term_code_page"
> of cygheap->locale.
> 
> Kind of like this:
> 
> diff --git a/winsup/cygwin/cygheap.h b/winsup/cygwin/cygheap.h
> index 8877cc358c39..2b84f4252071 100644
> --- a/winsup/cygwin/cygheap.h
> +++ b/winsup/cygwin/cygheap.h
> @@ -341,6 +341,7 @@ struct cygheap_debug
>  struct cygheap_locale
>  {
>mbtowc_p mbtowc;
> +  UINT term_code_page;
>  };
>  
>  struct user_heap_info
> diff --git a/winsup/cygwin/nlsfuncs.cc b/winsup/cygwin/nlsfuncs.cc
> index 668d7eb9e778..752f4239d911 100644
> --- a/winsup/cygwin/nlsfuncs.cc
> +++ b/winsup/cygwin/nlsfuncs.cc
> @@ -1298,6 +1298,9 @@ __set_charset_from_locale (const char *locale, char 
> *charset)
>   LOCALE_IDEFAULTANSICODEPAGE | LOCALE_RETURN_NUMBER,
>   (PWCHAR) &cp, sizeof cp))
>  cp = 0;
> +  /* Store codepage in cygheap->locale so fhandler_tty can switch the
> + pseudo console to the correct codepage. */
> +  cygheap->locale.term_code_page = cp ?: CP_UTF8;
>/* Translate codepage and lcid to a charset closely aligned with the 
> default
>   charsets defined in Glibc. */
>const char *cs;
> 
> Make sense?

I have tried your code, however, it does not work as expected.
It seems that __set_charset_from_locale() is not called.
cygheap->locale.term_code_page is always 0.

I have added following lines into setup_locale() to make sure
to call __set_charset_from_locale() for a test,

  setlocale (LC_ALL, "");
  __set_charset_from_locale (__get_global_locale()->categories[LC_CTYPE], 
charset);
  get_ttyp ()->term_code_page = cygheap->locale.term_code_page;

however, term_code_page is set to 932 if locale is ja_JP.UTF-8.
In this case term_code_page should be CP_UTF8 (65001).

The code page retrieved in __set_charset_from_locale() is not
based on "UTF-8" but "ja_JP".

Let me consider a while.

-- 
Takashi Yano 


Re: [PATCH 3/3] fhandler_pty_slave::setup_locale: respect charset == "UTF-8"

2020-09-02 Thread Corinna Vinschen
Hi Takashi,

On Sep  3 01:25, Takashi Yano via Cygwin-patches wrote:
> Hi Corinna,
> 
> On Wed, 2 Sep 2020 17:24:50 +0200
> Corinna Vinschen  wrote:
> > > > get_locale_from_env() and get_langinfo() should go away.  If we just
> > > > need a codepage for get_ttyp ()->term_code_page, we should really find a
> > > > way to do this from within internal_setlocale().
> > > 
> > > I looked into internal_setlocale() code, but I could not found
> > > the code which handles thecode page. I found the code handling
> > > the code page in __set_charset_from_locale() function in nlsfuncs.cc,
> > > but it does not return code page itself. Could you please explain
> > > more detail of your idea?
> > 
> > I had none yet :)  I was just musing, without actually thinking about a
> > solution.  But I think this isn't very complicated.  Given this is
> > inside Cygwin, nothing keeps the function to have a well-defined
> > side-effect, as in setting a (not yet existing) member "term_code_page"
> > of cygheap->locale.
> > 
> > Kind of like this:
> > [...]
> I have tried your code, however, it does not work as expected.
> It seems that __set_charset_from_locale() is not called.
> cygheap->locale.term_code_page is always 0.

Ah, right!  Take a look into newlib/libc/locale/locale.c, function
__loadlocale().  This function is called from _setlocale_r().  However,
it calls __set_charset_from_locale() *only* if the charset isn't already
given explicitely in the LC_* or LANG string, because otherwise we
already know the charset, after all.

Darn!  That foils my plans for world domination...

> Let me consider a while.

Thanks, I'll do the same.  I'd really like to simplify this stuff
and doing the locale shuffle in two entirely different locations
at different times is prone to getting out of sync.


Corinna