Re: [PATCH] Use UTF-8 active code page for Windows host.

2023-03-19 Thread Costas Argyris
Does this support require Make to be linked against the UCRT
run-time library, or does it also work with the older MSVCRT?

I haven't found anything explicitly mentioned about this in the official
doc:

https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page

Also, it is possible to apply the manifest even post-compilation of the
executable, using mt.exe (MS standard workflow) on it, so it shouldn't
matter which run-time it is linked against, because the manifest can be
applied even after the link phase.  Not sure if that's a convincing
argument though.
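For anyone unfamiliar with that workflow: the manifest in question is a
small XML fragment (reproduced here from Microsoft's documentation page
linked above, as a sketch rather than the exact file in the patch):

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<assembly manifestVersion="1.0" xmlns="urn:schemas-microsoft-com:asm.v1">
  <application>
    <windowsSettings>
      <activeCodePage xmlns="http://schemas.microsoft.com/SMI/2019/WindowsSettings">UTF-8</activeCodePage>
    </windowsSettings>
  </application>
</assembly>
```

Embedding it after the link phase then looks something like
`mt.exe -manifest utf8.manifest -outputresource:make.exe;#1`
(the `;#1` selects the standard manifest resource ID of an executable;
the file name utf8.manifest is just an example).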

If Make is built with MSVC, does it have to be built with some new
enough version of Studio to have the necessary run-time support
for this feature, or will any version do?

I haven't built Make with MSVC at all (patch is focused on building with
GNU tools) but again there is no mention of this in the official doc above.

It is just another case of using a manifest file, where this time the
manifest is used to set the active code page of the process to UTF-8.

In fact, the manifest can be embedded into the target executable even
post-compilation, using mt.exe, so I don't think a recent version of VS
is a requirement to build properly.

Does using UTF-8 as the active page in Make mean that locale-dependent
C library functions will behave as expected?

I think so.  Here is the relevant doc I found:

https://learn.microsoft.com/en-us/cpp/text/locales-and-code-pages?view=msvc-170

where the interesting bits are those where "operating system" is mentioned,
like:

"Also, the run-time library might obtain and use the value of the
operating system code page, which is constant for the duration of the
program's execution."

I believe that by setting the active code page of the process to UTF-8 we
are effectively forcing the process to think that the operating system
code page is UTF-8, as far as that process is concerned.

Did you try running Make with this manifest on older Windows systems,
like Windows 8.1 or 7?  It is important to make sure this manifest doesn't
preclude Make from running on those older systems, even though the
UTF-8 feature will then be unavailable.

I did not try as I don't have access to such systems, but it seems pretty
clear from the doc that this should not be a problem:

"You can declare this property and target/run on earlier Windows builds,
but you must handle legacy code page detection and conversion as usual.
With a minimum target version of Windows Version 1903, the process code
page will always be UTF-8 so legacy code page detection and conversion can
be avoided."

It sounds like it will simply not use UTF-8, meaning that any UTF-8 input
would still cause Make to break, but that would happen anyway with such
input.  Based on the above, it shouldn't change existing behavior on these
older systems, and certainly not stop Make from running on them.

When Make invokes other programs (which it does quite a lot ;-),
and passes command-line arguments to it with non-ASCII characters,
what will happen to those non-ASCII characters?

I think your expectation is correct.  Windows seems to be converting the
UTF-8 encoded strings to the current ANSI codepage, therefore allowing
non-ASCII characters (that are part of that ANSI codepage) to be
propagated to the non-UTF-8 program.

Below are some experiments to show this.

In what follows, 'mingw32-make' is today's (unpatched) Make for Windows, as
found in a typical mingw build distribution.  Since it is unpatched, it is
using the local ANSI codepage, which is windows-1252 on my machine.

'make' is the patched version which uses the UTF-8 codepage.

Makefile 'windows-1252-non-ascii.mk' is encoded in 1252 and has content:

hello :
	gcc ©\src.c -o ©\src.exe

where the (extended ASCII) Copyright sign has been used (0xA9 in 1252).

Makefile 'utf8.mk' has the same content but is encoded in UTF-8, so the
Copyright sign is represented as 0xC2 0xA9 (a two-byte UTF-8 sequence,
confirmed with a hex editor).
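As an aside (my own illustration, not part of the original message), the
two byte sequences mentioned above can be verified in a couple of lines
of Python:

```python
# Compare how the Copyright sign U+00A9 is encoded in the two
# encodings used by the experiment: windows-1252 vs. UTF-8.
sign = "\u00a9"  # ©

cp1252_bytes = sign.encode("cp1252")
utf8_bytes = sign.encode("utf-8")

print(cp1252_bytes.hex())  # a9    (single byte in windows-1252)
print(utf8_bytes.hex())    # c2a9  (two-byte UTF-8 sequence)
```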

With the unpatched Make that uses the local codepage:

mingw32-make -f windows-1252-non-ascii.mk

works fine and produces the .exe under the copyright folder (current
behavior).

mingw32-make -f utf8.mk

breaks because the unpatched make can't understand the UTF-8 file
(expected).

With the patched Make that uses the UTF-8 codepage:

make -f windows-1252-non-ascii.mk

breaks because Make expects UTF-8 and we are feeding it with a 1252 file.

make -f utf8.mk

works fine and produces the .exe under the copyright folder.

I believe this last case is the one that answers your question:

Make (now working in UTF-8) calls gcc (working in 1252) with some UTF-8
encoded arguments.  gcc has no problem doing the compilation and
producing the executable under the Copyright folder, which suggests that
Windows did indeed convert the UTF-8 arguments into gcc's codepage (1252),
and because the Copyright sign does exist in 1252 the conversion was
successful, allowing gcc to run.
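A rough model of that conversion (a hypothetical sketch of my own, not
the real Windows API, which may substitute a default character instead
of failing outright):

```python
def to_ansi(arg_utf8: bytes, ansi_codepage: str = "cp1252") -> bytes:
    """Model Windows handing a UTF-8 argument to a non-UTF-8 program:
    decode from UTF-8, re-encode in the target's ANSI codepage."""
    return arg_utf8.decode("utf-8").encode(ansi_codepage)

# The Copyright sign exists in windows-1252, so conversion succeeds:
print(to_ansi("©\\src.c".encode("utf-8")))  # b'\xa9\\src.c'

# A character outside windows-1252 (e.g. U+274E) cannot be converted:
try:
    to_ansi("\u274e\\src.c".encode("utf-8"))
except UnicodeEncodeError:
    print("not representable in cp1252")
```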

So it doesn't 

Re: [PATCH] Use UTF-8 active code page for Windows host.

2023-03-19 Thread Paul Smith
On Sun, 2023-03-19 at 13:42 +, Costas Argyris wrote:
> I cross-compiled Make for Windows using gcc (mingw-w64) and the
> autoconf + automake + configure + make approach, so it clearly worked
> for me, but I didn't imagine that this wasn't the standard way to
> build for Windows host.

There is no one "standard way".  The GNU project doesn't provide
binaries on any platform, including Windows: we only provide source
code.  So whatever methods people use to build the software is the
"standard way" for them.

I don't do Windows: my system did not come with Windows and I don't own
a Windows license, so when I test before releasing these days I use the
free developer Windows VM provided by Microsoft.  It expires regularly
so I don't spend time customizing it.  Because of that I personally use
MSVC (the latest version, which comes pre-installed on the VM) and the
build_w32.bat file, and I have Git for Windows POSIX tools on my PATH.
I install Strawberry Perl to be able to run the regression tests, and
that's it.

This is only intended to be a trivial, anti-brown-paper-bag test for
coding errors or obvious regressions.

Other people (like Eli who is the primary maintainer of GNU Make for
Windows) have other environments and do more vigorous testing.  But I
don't believe Eli uses autotools on Windows, either.

> Does this mean that all builds of Make found in the various build
> distributions of the GNU toolchain for Windows (like mingw32-make.exe
> in the examples above) were necessarily built using build_w32.bat?

You will have to ask each of them.  They all do their own thing and we
must be doing an OK job of keeping things portable, since they rarely
come back to us with requests of any kind so we don't really know what
they are doing.

> Assuming all questions are answered first, would it be OK to work on
> the build_w32.bat changes in a second separate patch, and keep the
> first one focused only on the Unix-like build process?

Patches can be provided in any order, but until build_w32.bat is
updated there won't be any testing of these features during the
"normal" development process.  Presumably (but again, you'll have to
ask them) the MinGW folks etc. will take release candidate builds and
verify them in their own environments, once those become available.

This is not to discourage you in any way: UTF-8 is assumed by GNU Make
on POSIX systems and getting that to be true on Windows is a big step
in the right direction IMO!



Re: [PATCH] Use UTF-8 active code page for Windows host.

2023-03-19 Thread Paul Smith
On Sat, 2023-03-18 at 16:37 +, Costas Argyris wrote:
> The attached patch incorporates the UTF-8 manifest into the build
> process of GNU Make when hosted on Windows, and forces the
> built executable to use UTF-8 as its active code page, solving all
> problems shown above because this has a global effect in the process.

Thanks for this patch!  I'll let Eli comment on the Windows/resource
parts and I'll look at the autotools parts.

It would be nice if there was a regression test or two created that
would show this behavior.



Re: [PATCH] Use UTF-8 active code page for Windows host.

2023-03-19 Thread Eli Zaretskii
> From: Costas Argyris 
> Date: Sun, 19 Mar 2023 13:42:52 +
> Cc: bug-make@gnu.org
> 
> Does this support require Make to be linked against the UCRT
> run-time library, or does it also work with the older MSVCRT?
> 
> I haven't found anything explicitly mentioned about this in the official
> doc:
> 
> https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page

OK, but how is the make.exe you produced built? is it using UCRT or
MSVCRT when it runs?  You can check that by examining the dependencies
of the .exe file with, e.g., the Dependency Walker program
(https://www.dependencywalker.com/) or similar.  Or just use objdump
from GNU Binutils:

  objdump -p make.exe | fgrep "DLL Name:"

and see if this shows MSVCRT.DLL or the UCRT one.

> Does using UTF-8 as the active page in Make mean that locale-dependent
> C library functions will behave as expected?
> 
> I think so.  Here is the relevant doc I found:
> 
> https://learn.microsoft.com/en-us/cpp/text/locales-and-code-pages?view=msvc-170

This is not enough.  If locale-dependent C library functions still
support only the characters expressible with the ANSI codepage, then a
program using the UTF-8 active codepage will be unable to process the
non-ASCII characters outside of the ANSI codepage correctly.  For
example, downcasing such characters or comparing them in
case-insensitive manner will not work.  This is because for this to
work those functions need to have access to tables of character
properties for the entire Unicode range, not just for the current
locale.  If you try using in a Makefile file names with non-ASCII
characters outside of the current ANSI codepage, does Make succeed to
recognize files mentioned in the Makefile whose letter-case is
different from what is seen in the file system?

> Also, since the above experiments seem to suggest that we are not
> dropping existing support for non-ASCII characters in programs
> called by Make, it seems like a clear step forwards in terms of
> Unicode support on Windows.

I agree.

> I cross-compiled Make for Windows using gcc (mingw-w64) and the
> autoconf + automake + configure + make approach, so it clearly worked
> for me, but I didn't imagine that this wasn't the standard way to build for
> Windows host.

Make is a basic utility used to build others, so we don't require a
full suite of build tools for building Make itself.

> Does this mean that all builds of Make found in the various build
> distributions of the GNU toolchain for Windows (like
> mingw32-make.exe in the examples above) were necessarily built using
> build_w32.bat?

I don't know.  I can tell you that the precompiled binaries I make
available here:

  https://sourceforge.net/projects/ezwinports/files/

are produced by running that batch file.

> Since build_w32.bat is a Windows-specific batch file, does this rule out
> cross-compilation as a canonical way to build Make for Windows?

No, it doesn't rule that out.  But using cross-compilation is not very
important these days, since one can have a fully functional MinGW
build environment quite easily.

> Assuming all questions are answered first, would it be OK to work on the
> build_w32.bat changes in a second separate patch, and keep the first one
> focused only on the Unix-like build process?

Yes.  But my point is that without also changing build_w32.bat the
change is incomplete.



Re: [PATCH] Use UTF-8 active code page for Windows host.

2023-03-19 Thread Eli Zaretskii
> From: Paul Smith 
> Cc: bug-make@gnu.org
> Date: Sun, 19 Mar 2023 10:27:16 -0400
> 
> Other people (like Eli who is the primary maintainer of GNU Make for
> Windows) have other environments and do more vigorous testing.  But I
> don't believe Eli uses autotools on Windows, either.

I do use autotools on Windows, just not for building GNU Make.

> > Assuming all questions are answered first, would it be OK to work on
> > the build_w32.bat changes in a second separate patch, and keep the
> > first one focused only on the Unix-like build process?
> 
> Patches can be provided in any order, but until build_w32.bat is
> updated there won't be any testing of these features during the
> "normal" development process.  Presumably (but again, you'll have to
> ask them) the MinGW folks etc. will take release candidate builds and
> verify them in their own environments, once those become available.
> 
> This is not to discourage you in any way: UTF-8 is assumed by GNU Make
> on POSIX systems and getting that to be true on Windows is a big step
> in the right direction IMO!

Indeed.  But build_w32.bat is a very simple batch file, so I don't
think modifying it will present any difficulty.  Let us know if you
need help in that matter.



Re: [PATCH] Use UTF-8 active code page for Windows host.

2023-03-19 Thread Eli Zaretskii
> From: Paul Smith 
> Date: Sun, 19 Mar 2023 10:32:36 -0400
> 
> It would be nice if there was a regression test or two created that
> would show this behavior.

If we add tests for this feature (and I agree it's desirable), we
should generate the files with non-ASCII names for those tests as part
of the test script, not having them ready in the repository and the
tarball.  That's because unpacking a tarball with non-ASCII characters
and/or having them in Git will immediately cause problems on Windows,
where the unpacking tools and at least some versions of Git for
Windows cannot cope with arbitrary non-ASCII file names.  The Texinfo
project had quite a few similar problems, and ended up generating the
files as part of running the test suite as the only viable solution.



Re: [PATCH] Use UTF-8 active code page for Windows host.

2023-03-19 Thread Paul Smith
On Sun, 2023-03-19 at 16:47 +0200, Eli Zaretskii wrote:
> If we add tests for this feature (and I agree it's desirable), we
> should generate the files with non-ASCII names for those tests as
> part of the test script, not having them ready in the repository and
> the tarball.

Agreed for sure; plus that's how all the tests work today (create the
test files they are going to use and delete them again after) so new
tests should follow that precedent.



Re: [PATCH] Use UTF-8 active code page for Windows host.

2023-03-19 Thread Eli Zaretskii
> Date: Sun, 19 Mar 2023 16:38:08 +0200
> From: Eli Zaretskii 
> Cc: bug-make@gnu.org
> 
> > From: Costas Argyris 
> > Date: Sun, 19 Mar 2023 13:42:52 +
> > Cc: bug-make@gnu.org
> > 
> > Also, since the above experiments seem to suggest that we are not
> > dropping existing support for non-ASCII characters in programs
> > called by Make, it seems like a clear step forwards in terms of
> > Unicode support on Windows.
> 
> I agree.

Btw, there's one aspect where Make on MS-Windows will probably fall
short of modern Posix systems: the display of non-ASCII characters on
the screen.  Such as the "Entering directory FOO" and echo of the
commands being run by Make.  A typical Windows console (a.k.a.
"Command Prompt" window) can display non-ASCII characters only from a
handful of scripts due to limitations of the fonts used for these
windows, and in addition displaying UTF-8 encoded characters in these
windows using printf etc. doesn't work well.  So users who use such
non-ASCII characters in their Makefiles should expect a lot of
mojibake on the screen.



Re: [PATCH] Use UTF-8 active code page for Windows host.

2023-03-19 Thread Costas Argyris
OK, but how is the make.exe you produced built?

I actually did what you suggested but was somewhat confused by the
result.  Usually I do this with 'ldd', but both msvcrt.dll and
ucrtbase.dll show up in 'ldd make.exe' output, and I wasn't sure what to
think of it.

However, your approach with objdump gives fewer results and only
lists msvcrt.dll, not ucrtbase.dll:

C:\Users\cargyris\temp>objdump -p make.exe | grep "DLL Name:"
DLL Name: ADVAPI32.dll
DLL Name: KERNEL32.dll
DLL Name: msvcrt.dll
DLL Name: USER32.dll

So I guess MSVCRT is enough, i.e. no need for UCRT.

If you try using in a Makefile file names with non-ASCII
characters outside of the current ANSI codepage, does Make succeed to
recognize files mentioned in the Makefile whose letter-case is
different from what is seen in the file system?

I think it does, here is the experiment:

C:\Users\cargyris\temp>ls ❎
 src.c

There is only src.c in that folder.

Makefile utf8.mk is UTF-8 encoded and has this content that
checks for the existence of:

❎\src.c
❎\src.C
❎\src.cs

where ❎ is outside the ANSI codepage (1252).

If I understand this correctly, both src.c and src.C should be found,
but not src.cs (just to show a negative case as well).

hello :
	@gcc ©\src.c -o ©\src.exe

ifneq ("$(wildcard ❎\src.c)","")
	@echo ❎\src.c exists
else
	@echo ❎\src.c does NOT exist
endif

ifneq ("$(wildcard ❎\src.C)","")
	@echo ❎\src.C exists
else
	@echo ❎\src.C does NOT exist
endif

ifneq ("$(wildcard ❎\src.cs)","")
	@echo ❎\src.cs exists
else
	@echo ❎\src.cs does NOT exist
endif

Here is the result of running the UTF-8-patched Make on it:

C:\Users\cargyris\temp>make.exe -f utf8.mk
❎\src.c exists
❎\src.C exists
❎\src.cs does NOT exist

I don't know if that was a good way to test your point, feel free to
suggest a different one if it was not.  It seems to be doing the right
thing, finding the .C file as well.

Indeed.  But build_w32.bat is a very simple batch file, so I don't
think modifying it will present any difficulty.  Let us know if you
need help in that matter.

Sure, thanks.

Btw, there's one aspect where Make on MS-Windows will probably fall
short of modern Posix systems: the display of non-ASCII characters on
the screen.

Indeed, some thoughts on that:

1) As you know, this only affects the visual aspect of the logs, not the
inner workings of Make.  This could confuse users because they would be
seeing "errors" on the screen without there being any real errors.
Perhaps a mention in the doc or release notes could remedy that.

2) To some extent (maybe even completely, I don't know) this can be
mitigated by using PowerShell instead of the classic Command Prompt.
This seems to be working in this case at least:

Command Prompt:

C:\Users\cargyris\temp>make.exe -f utf8.mk
echo â?Z\src.c exists

PowerShell:

PS C:\Users\cargyris\temp> make.exe -f utf8.mk
echo ❎\src.c exists

If anything, it could be worth a mention in the doc.

On Sun, 19 Mar 2023 at 14:38, Eli Zaretskii  wrote:

> > From: Costas Argyris 
> > Date: Sun, 19 Mar 2023 13:42:52 +
> > Cc: bug-make@gnu.org
> >
> > Does this support require Make to be linked against the UCRT
> > run-time library, or does it also work with the older MSVCRT?
> >
> > I haven't found anything explicitly mentioned about this in the official
> > doc:
> >
> >
> https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page
>
> OK, but how is the make.exe you produced built? is it using UCRT or
> MSVCRT when it runs?  You can check that by examining the dependencies
> of the .exe file with, e.g., the Dependency Walker program
> (https://www.dependencywalker.com/) or similar.  Or just use objdump
> from GNU Binutils:
>
>   objdump -p make.exe | fgrep "DLL Name:"
>
> and see if this shows MSVCRT.DLL or the UCRT one.
>
> > Does using UTF-8 as the active page in Make mean that locale-dependent
> > C library functions will behave as expected?
> >
> > I think so.  Here is the relevant doc I found:
> >
> >
> https://learn.microsoft.com/en-us/cpp/text/locales-and-code-pages?view=msvc-170
>
> This is not enough.  If locale-dependent C library functions still
> support only the characters expressible with the ANSI codepage, then a
> program using the UTF-8 active codepage will be unable to process the
> non-ASCII characters outside of the ANSI codepage correctly.  For
> example, downcasing such characters or comparing them in
> case-insensitive manner will not work.  This is because for this to
> work those functions need to have access to tables of character
> properties for the entire Unicode range, not just for the current
> locale.  If you try using in a Makefile file names with non-ASCII
> characters outside of the current ANSI codepage, does Make succeed to
> recognize files mentioned in the Makefile whose letter-case is
> different from what is seen in the file system?
>
> > Also, since the above experiments seem to suggest that we are not
> > dropping existin

Re: [PATCH] Use UTF-8 active code page for Windows host.

2023-03-19 Thread Eli Zaretskii
> From: Costas Argyris 
> Date: Sun, 19 Mar 2023 16:34:54 +
> Cc: bug-make@gnu.org, psm...@gnu.org
> 
> > OK, but how is the make.exe you produced built?
> 
> I actually did what you suggested but was somewhat confused with the
> result.  Usually I do this with 'ldd', but both msvcrt.dll and ucrtbase.dll
> show up in 'ldd make.exe' output, and I wasn't sure what to think of it.
> 
> However, your approach with objdump gives fewer results and only
> lists msvcrt.dll, not ucrtbase.dll:
> 
> C:\Users\cargyris\temp>objdump -p make.exe | grep "DLL Name:"
> DLL Name: ADVAPI32.dll
> DLL Name: KERNEL32.dll
> DLL Name: msvcrt.dll
> DLL Name: USER32.dll
> 
> So I guess MSVCRT is enough, i.e. no need for UCRT.

Yes, thanks.

> > If you try using in a Makefile file names with non-ASCII
> > characters outside of the current ANSI codepage, does Make succeed to
> > recognize files mentioned in the Makefile whose letter-case is
> > different from what is seen in the file system?
> 
> I think it does, here is the experiment:
> 
> C:\Users\cargyris\temp>ls ❎
>  src.c
> 
> There is only src.c in that folder.
> 
> Makefile utf8.mk is UTF-8 encoded and has this content that
> checks for the existence of:
> 
> ❎\src.c
> ❎\src.C
> ❎\src.cs
> 
> where ❎ is outside the ANSI codepage (1252).

That's not a good experiment, IMO: the only non-ASCII character here
is U+274E, which has no case variants.  And the characters whose
letter-case you tried to change are all ASCII, so their case
conversions are unaffected by the locale.
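Eli's two observations can be checked directly (a quick Python
illustration of my own, not from the original thread): U+274E is
caseless, and ASCII case mapping needs no locale-specific tables.

```python
# U+274E (❎) is a symbol with no upper/lower-case variants,
# so case mapping leaves it unchanged.
assert "\u274e".upper() == "\u274e"
assert "\u274e".lower() == "\u274e"

# The characters whose case the experiment varied are plain ASCII,
# whose case conversion does not depend on any locale.
assert "src.c".upper() == "SRC.C"
print("U+274E is caseless; ASCII case mapping needs no locale data")
```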

> If I understand this correctly, both src.c and src.C should be found,
> but not src.cs (just to show a negative case as well).

In addition, I'm not sure Make actually compares file names anywhere;
I think it just calls 'stat', and that is of course case-insensitive
(because the filesystem is, at the base level).

My guess would be that only characters within the locale, defined by
the ANSI codepage, are supported by locale-aware functions in the C
runtime.  That's because this is what happens even if you use "wide"
Unicode APIs and/or functions like _wcsicmp that accept wchar_t
characters: they all support only the characters of the current locale
set by 'setlocale'.  I don't expect that to change just because UTF-8
is used on the outside: internally, everything is converted to UTF-16,
i.e. to the Windows flavor of wchar_t.

> > Btw, there's one aspect where Make on MS-Windows will probably fall
> > short of modern Posix systems: the display of non-ASCII characters on
> > the screen.
> 
> Indeed, some thoughts on that:
> 
> 1) As you know, this is only affecting the visual aspect of the logs, not the
> inner workings of Make.  This could confuse users because they would
> be seeing "errors" on the screen, without there being any real errors.
> Perhaps a mention in the doc or release notes could remedy that.
> 
> 2) To some extent (maybe even completely, I don't know) this can be
> mitigated with using PowerShell instead of the classic Command Prompt.
> This seems to be working in this case at least:

This could be just sheer luck: PowerShell uses a font that supports
that particular character.  The basic problem here is that "Command
Prompt" windows don't allow configuring more than one font for
displaying characters, and a single font can never support more than a
few scripts.  If PowerShell doesn't allow more than a single font in
its windows, it will suffer from the same problem.

> If anything, it could be worth a mention in the doc.

Yes, of course.



[bug #63856] .WAIT does not work as special target on command line.

2023-03-19 Thread Dmitry Goncharov
Additional Item Attachment, bug #63856 (project make):

File name: sv63856_fix2_part1.diff    Size: 11 KB

File name: sv63856_fix2_part2.diff    Size: 15 KB




___

Reply to this item at:

  

___
Message sent via Savannah
https://savannah.gnu.org/




[bug #63856] .WAIT does not work as special target on command line.

2023-03-19 Thread Dmitry Goncharov
Follow-up Comment #2, bug #63856 (project make):

Paul, please disregard sv63856_part1.diff and sv63856_part2.diff.
sv63856_fix2_part1.diff and sv63856_fix2_part2.diff contain updated versions
of this fix.


___

Reply to this item at:

  

___
Message sent via Savannah
https://savannah.gnu.org/




Re: [PATCH] Use UTF-8 active code page for Windows host.

2023-03-19 Thread Costas Argyris
That's not a good experiment, IMO: the only non-ASCII character here
is U+274E, which has no case variants.  And the characters whose
letter-case you tried to change are all ASCII, so their case
conversions are unaffected by the locale.

OK, I think this is a better one: it uses U+03B2 and U+0392, which are
the lower- and upper-case forms of the same letter (β and Β).
I create a file src.β first:

touch src.β

and then run the following UTF-8 encoded Makefile:

hello :
	@gcc ©\src.c -o ©\src.exe

ifneq ("$(wildcard src.β)","")
	@echo src.β exists
else
	@echo src.β does NOT exist
endif

ifneq ("$(wildcard src.Β)","")
	@echo src.Β exists
else
	@echo src.Β does NOT exist
endif

ifneq ("$(wildcard src.βΒ)","")
	@echo src.βΒ exists
else
	@echo src.βΒ does NOT exist
endif

and the output of Make is:

C:\Users\cargyris\temp>make -f utf8.mk
src.β exists
src.Β exists
src.βΒ does NOT exist

which shows that it finds the one with the upper case extension as well,
despite the fact that it exists in the file system as a lower case
extension.
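For reference, the case relationship this experiment relies on can be
shown in a few lines of Python (an aside of my own, not part of the
original message):

```python
# U+03B2 (β) and U+0392 (Β) are the lower- and upper-case forms of the
# Greek letter beta, so a case-insensitive file-name match must map one
# to the other -- which requires Unicode case tables, not just the
# tables for a single ANSI codepage.
beta_lower = "\u03b2"   # β
beta_upper = "\u0392"   # Β

assert beta_lower.upper() == beta_upper
assert beta_upper.lower() == beta_lower
assert "src.\u03b2".casefold() == "src.\u0392".casefold()
print("β and Β are case variants of the same letter")
```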

My guess would be that only characters within the locale, defined by
the ANSI codepage, are supported by locale-aware functions in the C
runtime.  That's because this is what happens even if you use "wide"
Unicode APIs and/or functions like _wcsicmp that accept wchar_t
characters: they all support only the characters of the current locale
set by 'setlocale'.  I don't expect that to change just because UTF-8
is used on the outside: internally, everything is converted to UTF-16,
i.e. to the Windows flavor of wchar_t.

When the manifest is used to set the active code page of the process
to UTF-8, the current ANSI code page does become UTF-8, so that
might explain why the above example is working.

As mentioned in:

https://learn.microsoft.com/en-us/cpp/text/locales-and-code-pages?view=msvc-170

"Also, the run-time library might obtain and use the value of the operating
system code page, which is constant for the duration of the program's
execution."

This seems to be offering some kind of confirmation.

But this one looks most relevant to your point:

https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170#utf-8-support

"Starting in Windows 10 version 1803 (10.0.17134.0), the Universal C
Runtime supports using a UTF-8 code page. The change means that char
strings passed to C runtime functions can expect strings in the UTF-8
encoding. To enable UTF-8 mode, use ".UTF8" as the code page when using
setlocale. For example, setlocale(LC_ALL, ".UTF8") will use the current
default Windows ANSI code page (ACP) for the locale and UTF-8 for the code
page."

src/main.c:1245 has:

setlocale (LC_ALL, "");

so this could be changed to:

setlocale (LC_ALL, ".UTF8")

conditional on the Windows version above, but I'm not sure that is even
necessary, given the UTF-8 manifest change.

From reading the above doc, my understanding is that embedding the UTF-8
manifest has an effect that covers the C runtime as well.  For example:

"UTF-8 mode is also enabled for functions that have historically translated
char strings using the default Windows *ANSI code page (ACP)*. For example,
calling _mkdir("😊") while using a UTF-8 code page will correctly produce a
directory with that emoji as the folder name, *instead of requiring the ACP
to be changed to UTF-8 before running your program.* Likewise, calling
_getcwd() in that folder will return a UTF-8 encoded string. *For
compatibility, the ACP is still used if the C locale code page isn't set to
UTF-8.*"

I have highlighted the important parts in bold.

My point is, with the manifest embedded at build time, ACP will be UTF-8
already when the program (Make) runs, so no need to do anything more.

This advice is for how to use UTF-8 in the C runtime if you don't have
ACP == UTF-8.

The Unicode -W APIs are different compared to the -A APIs in that
they don't even look at the current ANSI code page, they just use UTF-16.


On Sun, 19 Mar 2023 at 17:01, Eli Zaretskii  wrote:

> > From: Costas Argyris 
> > Date: Sun, 19 Mar 2023 16:34:54 +
> > Cc: bug-make@gnu.org, psm...@gnu.org
> >
> > > OK, but how is the make.exe you produced built?
> >
> > I actually did what you suggested but was somewhat confused with the
> > result.  Usually I do this with 'ldd', but both msvcrt.dll and
> ucrtbase.dll
> > show up in 'ldd make.exe' output, and I wasn't sure what to think of it.
> >
> > However, your approach with objdump gives fewer results and only
> > lists msvcrt.dll, not ucrtbase.dll:
> >
> > C:\Users\cargyris\temp>objdump -p make.exe | grep "DLL Name:"
> > DLL Name: ADVAPI32.dll
> > DLL Name: KERNEL32.dll
> > DLL Name: msvcrt.dll
> > DLL Name: USER32.dll
> >
> > So I guess MSVCRT is enough, i.e. no need for UCRT.
>
> Yes, thanks.
>
> > > If you try using in a Makefile file names with non-ASCII
> > > characters outside of the current ANSI codepage, does Make succeed to
> > > recognize files ment

Two 'make -p' timestamp issues

2023-03-19 Thread Paul Eggert
In some unusual, timing-dependent cases on some platforms, 'make -p' can 
generate output that incorrectly makes it look like the system clock ran 
backwards, when it didn't.


Also, there's a torture-test case (system clock in the far past or 
future) where 'make -p' can dump core.


Proposed patch attached.
From 2e6355c4f3813c346ded2a90997a4308114d2893 Mon Sep 17 00:00:00 2001
From: Paul Eggert 
Date: Sun, 19 Mar 2023 14:40:39 -0700
Subject: [PATCH] Fix clock skew / crash issue with 'make -p'

* NEWS: Mention this.
* src/main.c (print_data_base): Use file_timestamp_sprintf
instead of time+ctime, to avoid inconsistent clocks.  See:
https://www.gnu.org/software/gnulib/manual/html_node/time.html
https://sourceware.org/bugzilla/show_bug.cgi?id=30200
This avoids output that incorrectly implies that the clock ran backwards.
Avoiding ctime also means 'make' won't have undefined behavior
if ctime crashes or returns a null pointer, which can happen if
the system clock is set far in the past or future.
---
 NEWS   |  5 +
 src/main.c | 10 ++
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index 4ab15136..dcce8d93 100644
--- a/NEWS
+++ b/NEWS
@@ -9,6 +9,11 @@ which is contained in this distribution as the file doc/make.texi.
 See the README file and the GNU Make manual for instructions for
 reporting bugs.
 
+* 'make --print-data-base' (or 'make -p') now outputs more-consistent
+  timestamps, e.g., "2023-03-19 14:23:42.570558743".  Previously it
+  sometimes also used the form "Sun Mar 19 14:23:42 2023", and
+  sometimes used a clock that ticked slightly inconsistently.
+
 
 Version 4.4.1 (26 Feb 2023)
 
diff --git a/src/main.c b/src/main.c
index a9d3a644..2a02e399 100644
--- a/src/main.c
+++ b/src/main.c
@@ -3744,11 +3744,13 @@ print_version (void)
 static void
 print_data_base (void)
 {
-  time_t when = time ((time_t *) 0);
+  int resolution;
+  char buf[FILE_TIMESTAMP_PRINT_LEN_BOUND + 1];
+  file_timestamp_sprintf (buf, file_timestamp_now (&resolution));
 
   print_version ();
 
-  printf (_("\n# Make data base, printed on %s"), ctime (&when));
+  printf (_("\n# Make data base, printed on %s"), buf);
 
   print_variable_data_base ();
   print_dir_data_base ();
@@ -3757,8 +3759,8 @@ print_data_base (void)
   print_vpath_data_base ();
   strcache_print_stats ("#");
 
-  when = time ((time_t *) 0);
-  printf (_("\n# Finished Make data base on %s\n"), ctime (&when));
+  file_timestamp_sprintf (buf, file_timestamp_now (&resolution));
+  printf (_("\n# Finished Make data base on %s\n"), buf);
 }
 
 static void
-- 
2.37.2