On Wed, 13 Sep 2023 17:38:28 GMT, Justin Lu <j...@openjdk.org> wrote:

>> JDK .properties files still use ISO-8859-1 encoding with escape sequences. 
>> It would improve readability to see the native characters instead of escape 
>> sequences (especially for the L10n process). The majority of files changed 
>> are localized resource files.
>> 
>> This change converts the Unicode escape sequences in the JDK .properties 
>> files (both in src and test) to UTF-8 native characters. Additionally, the 
>> build logic is adjusted to read the .properties files in UTF-8 while 
>> generating the ListResourceBundle files.
>> 
>> The only escape sequence not converted was `\u0020` as this is used to 
>> denote intentional trailing white space. (E.g. `key=This is the 
>> value:\u0020`)
>> 
>> The conversion was done using native2ascii with options `-reverse -encoding 
>> UTF-8`.
>> 
>> If this PR is integrated, the IDE default encoding for .properties files 
>> need to be updated to UTF-8. (IntelliJ IDEA locks .properties files as 
>> ISO-8859-1 unless manually changed).
>
> Justin Lu has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Replace InputStreamReader with BufferedReader

FWIW, I checked out the commit immediately preceding this change and found 
the following:


% git checkout b55e418a077791b39992042411cde97f68dc39fe^ 
% find src -name "*.properties" | xargs file | grep -v ASCII
src/java.xml/share/classes/com/sun/org/apache/xml/internal/serializer/Encodings.properties:
  ISO-8859 text
src/java.xml.crypto/share/classes/com/sun/org/apache/xml/internal/security/resource/xmlsecurity_de.properties:
  Unicode text, UTF-8 text, with very long lines (322)


This indicates that Encodings.properties is the only non-ASCII, non-UTF-8 
properties file under src. So we may be lucky.
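
Since file(1) only guesses the encoding heuristically, a stricter way to 
double-check is to actually decode every file. A rough sketch in Python 
follows. The helper names `classify` and `non_utf8_properties` are mine and 
are not part of the JDK build, and the sketch assumes it is run from the top 
of a checkout:

```python
from pathlib import Path


def classify(data: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'other' for the given raw bytes."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        # Strict decoding: any invalid byte sequence raises.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "other"


def non_utf8_properties(root: str) -> list[Path]:
    """List .properties files under root that are neither ASCII nor valid UTF-8."""
    return sorted(
        p
        for p in Path(root).rglob("*.properties")
        if p.is_file() and classify(p.read_bytes()) == "other"
    )
```

If the `file` output above is right, `non_utf8_properties("src")` on the 
parent commit should list only Encodings.properties. One caveat: "utf-8" here 
only means the bytes decode cleanly, which an ISO-8859-1 file could do by 
coincidence.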

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15694#issuecomment-2792014164
