Hi Yasumasa,
Please see:
https://mail.openjdk.org/pipermail/jdk-dev/2023-February/007342.html
Cheers,
David
On 7/02/2023 9:25 pm, Yasumasa Suenaga wrote:
Hi all,
We are discussing source file encoding in PR #12436 [1].
I saw some C4819 warnings when I tried to build OpenJDK on Windows with
the Japanese locale (CP932). C4819 means the source file contains
characters that cannot be represented in the current code page (CP932 in
my case).
I proposed suppressing C4819 in PR #12436, #12437 [2], and #12435 [3]. I
heard that JDK folks have discussed source file encoding several times,
and it looks like we expect UTF-8.
So I would like to propose adding `-utf-8` to CFLAGS for Windows. What do
you think?
The change is here:
https://github.com/YaSuenag/jdk/commit/272678f8f0a74d893d98b507f2c0562bff900b9d
GCC expects UTF-8 as the source file encoding [4]. OTOH, on Windows
cl.exe uses the current user code page when the source does not have a
BOM [5]. So I think we should think about Linux (I guess we can ignore
other platforms, e.g. macOS, because we haven't seen any reports related
to the locale, and the locale can be set directly there - WSL cannot do
so).
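To make the BOM point concrete, here is a rough sketch of how one could
list the sources that depend on the code page: files that contain
non-ASCII bytes but do not start with a UTF-8 BOM. The function names and
the suffix list are only assumptions for illustration, not part of the
JDK build, and a hit is only a candidate for C4819, not proof that
cl.exe will warn.

```python
import codecs
from pathlib import Path


def depends_on_code_page(data: bytes) -> bool:
    """Return True if data has non-ASCII bytes and no UTF-8 BOM.

    Such a file is decoded differently depending on the code page
    when the compiler is not told the source encoding explicitly.
    """
    if data.startswith(codecs.BOM_UTF8):
        return False
    return any(byte >= 0x80 for byte in data)


def find_candidates(root, suffixes=(".c", ".cpp", ".h", ".hpp")):
    """Walk root and collect native sources that depend on the code page.

    The suffix list is a hypothetical default, adjust it as needed.
    """
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            if depends_on_code_page(path.read_bytes()):
                hits.append(path)
    return hits
```

For example, `find_candidates("src")` would return the paths under a
(hypothetical) `src` directory that are worth checking by hand.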
This proposal affects all native components in the JDK, so I want to
discuss this topic before filing it in JBS and sending a PR.
I also think we should document the source file encoding somewhere. It
could go in "Operating System Requirements" in building.md. Let me know
if there is a better place.
Thanks,
Yasumasa
[1] https://github.com/openjdk/jdk/pull/12436
[2] https://github.com/openjdk/jdk/pull/12437
[3] https://github.com/openjdk/jdk/pull/12435
[4] https://gcc.gnu.org/onlinedocs/gcc-12.2.0/cpp/Character-sets.html
[5] https://learn.microsoft.com/en-us/cpp/build/reference/utf-8-set-source-and-executable-character-sets-to-utf-8?view=msvc-170