On Wed, 13 Sep 2023 17:38:28 GMT, Justin Lu <j...@openjdk.org> wrote:
>> JDK .properties files still use ISO-8859-1 encoding with escape sequences.
>> It would improve readability to see the native characters instead of escape
>> sequences (especially for the L10n process). The majority of files changed
>> are localized resource files.
>>
>> This change converts the Unicode escape sequences in the JDK .properties
>> files (both in src and test) to UTF-8 native characters. Additionally, the
>> build logic is adjusted to read the .properties files in UTF-8 while
>> generating the ListResourceBundle files.
>>
>> The only escape sequence not converted was `\u0020` as this is used to
>> denote intentional trailing white space. (E.g. `key=This is the
>> value:\u0020`)
>>
>> The conversion was done using native2ascii with options `-reverse -encoding
>> UTF-8`.
>>
>> If this PR is integrated, the IDE default encoding for .properties files
>> need to be updated to UTF-8. (IntelliJ IDEA locks .properties files as
>> ISO-8859-1 unless manually changed).
>
> Justin Lu has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   Replace InputStreamReader with BufferedReader

FWIW, I checked out the commit immediately preceding this change and found
the following:

% git checkout b55e418a077791b39992042411cde97f68dc39fe^
% find src -name "*.properties" | xargs file | grep -v ASCII
src/java.xml/share/classes/com/sun/org/apache/xml/internal/serializer/Encodings.properties: ISO-8859 text
src/java.xml.crypto/share/classes/com/sun/org/apache/xml/internal/security/resource/xmlsecurity_de.properties: Unicode text, UTF-8 text, with very long lines (322)

This indicates that Encodings.properties is the only non-ASCII, non-UTF-8
properties file under src at that revision. So we may be lucky.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15694#issuecomment-2792014164
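
P.S. For anyone who would rather not rely on the heuristics of file(1), the same check can be approximated with a strict UTF-8 decode. The following is only a minimal sketch, not part of the PR: the class name Utf8Check and the default root "src" are my own assumptions. Pure ASCII files pass, since ASCII is a subset of UTF-8. A Latin-1 file whose bytes happen to form valid UTF-8 sequences would also pass, so treat the result as a hint rather than proof.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper (not from the PR): lists .properties files that
// cannot be decoded as strict UTF-8.
class Utf8Check {

    // Returns true if the bytes decode as UTF-8 without any malformed input.
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed default: scan "src" relative to the working directory.
        Path root = Path.of(args.length > 0 ? args[0] : "src");

        // Collect the candidate files first so the directory stream is closed
        // before any file contents are read.
        List<Path> files;
        try (Stream<Path> paths = Files.walk(root)) {
            files = paths.filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".properties"))
                    .collect(Collectors.toList());
        }

        // Print only the files that fail strict UTF-8 decoding.
        for (Path p : files) {
            if (!isValidUtf8(Files.readAllBytes(p))) {
                System.out.println(p);
            }
        }
    }
}
```

Run from the top of a JDK checkout as `java Utf8Check.java src` (single-file source launch, JDK 11 or later); any path it prints is a candidate for manual inspection.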