Integrated: 8354968: Replace unicode sequences in comment text with UTF-8 characters

Magnus Ihse Bursie Tue, 13 May 2025 23:47:22 -0700

On Thu, 17 Apr 2025 14:42:37 GMT, Magnus Ihse Bursie <[email protected]> wrote:


> As part of the UTF-8 cleanup done in
> [JDK-8301971](https://bugs.openjdk.org/browse/JDK-8301971), I looked at where
> and how we are using Unicode escape sequences (`\uXXXX`). In several string
> literals, I think the Unicode sequences still have merit, if they improve
> clarity or readability of the code. Some instances are more of a gray area.
> But the places where they do not make sense at all are in comments, as part
> of running text. There they are just disruptive and not helpful at all. I
> tried to locate all such places and fix them (but I might have missed some,
> since I did not do a proper lexical analysis to find comments).
> 
> 99% of this fix is to turn poor `Peter von der Ah\u00e9` into `Peter von der 
> Ahé`. 😆 
> 
> I checked some random samples of when these were introduced, to see if
> there was a particular commit that mistreated the encoding, but they have
> been there since the original release of the OpenJDK source code.
> 
> There are likely many more places where directly UTF-8 encoded characters
> are preferable to Unicode sequences, but this seemed like a safe and trivial
> first step.
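
For illustration only, the kind of comment cleanup described in the quoted text could be sketched roughly as below. This is not the tooling used for the actual change; the class name `CommentUnescaper`, the method `unescapeCommentLine`, and the "line looks like a comment" heuristic are all assumptions made for this example. Like the approach described above, it does no real lexical analysis, so it can misjudge lines (for instance a code line that begins with `*`).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the JDK or of the actual changeset.
final class CommentUnescaper {
    // Matches a backslash, one or more 'u', then exactly four hex digits.
    private static final Pattern ESCAPE =
            Pattern.compile("\\\\u+([0-9a-fA-F]{4})");

    /**
     * Replaces unicode escapes with the characters they denote, but only
     * when the line looks like a comment line. This is a heuristic
     * (assumption): it checks the first non-blank characters only.
     */
    public static String unescapeCommentLine(String line) {
        String trimmed = line.stripLeading();
        boolean looksLikeComment = trimmed.startsWith("//")
                || trimmed.startsWith("/*")
                || trimmed.startsWith("*");
        if (!looksLikeComment) {
            return line; // leave code and string literals untouched
        }
        Matcher m = ESCAPE.matcher(line);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            // Convert the four hex digits into the corresponding UTF-16 unit.
            char c = (char) Integer.parseInt(m.group(1), 16);
            m.appendReplacement(sb, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // A comment line is rewritten; a string literal line is not.
        System.out.println(unescapeCommentLine(" * @author Peter von der Ah\\u00e9"));
        System.out.println(unescapeCommentLine("String s = \"Ah\\u00e9\";"));
    }
}
```

The sketch deliberately skips lines that do not look like comments, matching the point above that escapes in string literals may still have merit. It does not handle escaped backslashes or trailing comments on code lines, and the rewritten file would need to be saved as UTF-8.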

This pull request has now been integrated.

Changeset: a3e094e1
Author:    Magnus Ihse Bursie <[email protected]>
URL:       https://git.openjdk.org/jdk/commit/a3e094e1a0716adf52dad6407eb7877682beec92
Stats:     158 lines in 153 files changed: 0 ins; 2 del; 156 mod

8354968: Replace unicode sequences in comment text with UTF-8 characters

Reviewed-by: naoto

-------------

PR: https://git.openjdk.org/jdk/pull/24727
