Re: RFR: 8354968: Replace unicode sequences in comment text with UTF-8 characters [v4]

Magnus Ihse Bursie Fri, 09 May 2025 03:49:02 -0700

On Fri, 9 May 2025 10:12:09 GMT, Magnus Ihse Bursie <[email protected]> wrote:


>> As part of the UTF-8 cleanup done in 
>> [JDK-8301971](https://bugs.openjdk.org/browse/JDK-8301971), I looked at 
>> where and how we are using unicode sequences (`\uXXXX`). In several string 
>> literals, I think the unicode sequences still have merit, if they improve 
>> clarity or readability of the code. Some instances are more of a gray zone. 
>> But the places where they do not make sense at all are in comments, as part 
>> of running comment text. There they are just disruptive and not helpful at 
>> all. I tried to locate all such places (but I might have missed some, since 
>> I did not do a proper lexical analysis to find comments) and fix them.
>> 
>> 99% of this fix is to turn poor `Peter von der Ah\u00e9` into `Peter von der 
>> Ahé`. 😆 
>> 
>> I checked a few random samples to see when these were introduced, and 
>> whether some particular commit had mistreated the encoding, but they have 
>> been there since the original release of the OpenJDK source code.
>> 
>> There are likely many more places where direct UTF-8 encoded characters are 
>> preferable to unicode sequences, but this seemed like a safe and trivial 
>> first step.
>
> Magnus Ihse Bursie has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Clarify non-ASCII characters with unicode code point

Now that [JDK-8301971](https://bugs.openjdk.org/browse/JDK-8301971) is 
committed, this PR is ready to be integrated.
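
For illustration only (this is not the procedure used for this PR), the kind 
of heuristic replacement described in the PR text could be sketched in Python 
as follows. The names `decode_escapes` and `fix_comment_line` are hypothetical, 
and like the approach described, the sketch does no proper lexical analysis: 
it only touches lines that begin with a comment marker, so trailing comments 
and unusual layouts are missed.

```python
import re

# Matches a Java-style unicode escape: a backslash, one or more 'u',
# then exactly four hex digits (for example the escape for U+00E9).
UNICODE_ESCAPE = re.compile(r"\\u+([0-9a-fA-F]{4})")


def _replace(match: re.Match) -> str:
    code_point = int(match.group(1), 16)
    # Assumption: leave ASCII escapes (could be control characters) and
    # surrogate halves untouched; only plain non-ASCII BMP characters
    # are converted.
    if code_point < 0x80 or 0xD800 <= code_point <= 0xDFFF:
        return match.group(0)
    return chr(code_point)


def decode_escapes(text: str) -> str:
    # Replace each qualifying escape with the character it denotes.
    return UNICODE_ESCAPE.sub(_replace, text)


def fix_comment_line(line: str) -> str:
    # Heuristic only: treat a line as comment text if it starts with a
    # common comment marker. String literals on code lines are not touched.
    if line.lstrip().startswith(("//", "/*", "*")):
        return decode_escapes(line)
    return line


if __name__ == "__main__":
    print(fix_comment_line(r" * @author Peter von der Ah\u00e9"))
    # prints " * @author Peter von der Ahé"
```

Because the check is line-based, a result from such a script would still need 
manual review before being committed.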

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24727#issuecomment-2866080024
