Hi Fabian,

On 11/5/24 12:52 AM, Fabian Meumertzheim wrote:
On Mon, Nov 4, 2024 at 8:46 PM Naoto Sato <naoto.s...@oracle.com> wrote:
I am afraid that the risk that would be involved in configuring
sun.jnu.encoding exceeds the benefit it would bring, as the encoding is
so deeply baked into the foundation of the Windows Java runtime. Since Microsoft
itself now recommends users choose UTF-8 as the ANSI code page (over
changing apps to use -W APIs)[1], I think we would want to wait for that
glorious day.

Naoto

[1]
https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page#-a-vs--w-apis

My understanding of that page is that Microsoft recommends that
*application developers* choose UTF-8 as the code page for their
apps by adding a directive to their app manifest. While this works
well for native applications, it doesn't directly apply to Java
applications as the manifest is that of the java.exe launcher binary,
which is necessarily static (and currently doesn't set the
`activeCodePage` directive).

Yes, the article is for app developers, but my intention in quoting that specific paragraph (-A vs -W) was to point out Microsoft's change of direction:

```
Until recently, Windows has emphasized "Unicode" -W variants over -A APIs. However, recent releases have used the ANSI code page and -A APIs as a means to introduce UTF-8 support to apps. If the ANSI code page is configured for UTF-8, then -A APIs typically operate in UTF-8. This model has the benefit of supporting existing code built with -A APIs without any code changes.
```

This was a 180-degree change of direction, which lets ANSI-based apps (including the Java launcher) work without any changes on the app side.


We could choose to rely on users switching to the UTF-8 codepage
system-wide. This is possible as of the 1809 build of Windows 10, but
is not the default, still marked as Beta in the latest version,
requires admin privileges to enable, and can break other applications,
even of other users. This may become the default some day, but it's
unclear whether this will happen in the foreseeable future, especially
since there is a backwards compatible alternative for native
applications.

I cannot speak for MS, but I read the article as saying that the day will still come when UTF-8 becomes the default on Windows.


I understand that incrementally refactoring the Windows Java runtime
until its encoding becomes configurable is too risky. Taking that into
account, what do you think of offering an additional entrypoint for
the Java launcher on Windows, say java-utf8.exe, that is identical to
java.exe except that it specifies
`<activeCodePage>UTF-8</activeCodePage>` in its app manifest? This
would give users the desired opt-in behavior with no changes to the
actual implementation of the Java runtime. (In fact, in my concrete
use case, we are relying on this as a workaround by patching the
manifest in java.exe with a tool [1].)

Yes, it would be possible if two launchers were provided. However, please note that it would also double the maintenance cost. Some JDK distributors may be interested, but I am not sure it would be implemented in the OpenJDK Windows reference implementation.
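
For anyone experimenting with a manifest-patched launcher in the meantime, a minimal sketch along these lines could help confirm which encodings the runtime actually picked up. The class name `EncodingReport` is only illustrative. Note that sun.jnu.encoding is an internal, unsupported property, and that native.encoding and stdout.encoding only exist on newer JDKs (17 and 19 respectively, as far as I know), so they may show up as unset.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

public class EncodingReport {

    // Properties to inspect. sun.jnu.encoding is internal and unsupported;
    // native.encoding and stdout.encoding are absent on older runtimes.
    static final String[] PROPERTIES = {
        "sun.jnu.encoding",
        "native.encoding",
        "file.encoding",
        "stdout.encoding"
    };

    // Collects each property value, substituting a marker when it is not set.
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        for (String name : PROPERTIES) {
            report.put(name, System.getProperty(name, "(not set)"));
        }
        report.put("Charset.defaultCharset()", Charset.defaultCharset().name());
        return report;
    }

    public static void main(String[] args) {
        collect().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Running it once with the stock java.exe and once with the patched launcher should show whether the `activeCodePage` directive changes what sun.jnu.encoding reports. The values depend on the machine's configuration, so I am not listing expected output here.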

Naoto
