Re: RFR: 8354968: Replace unicode sequences in comment text with UTF-8 characters [v4]

Magnus Ihse Bursie Fri, 09 May 2025 08:21:25 -0700

> As part of the UTF-8 cleanup done in 
> [JDK-8301971](https://bugs.openjdk.org/browse/JDK-8301971), I looked at where 
> and how we are using unicode sequences (`\uXXXX`). In several string 
> literals, I think the unicode sequences still have merit, if they improve 
> clarity or readability of the code. Some instances are more of a gray zone. 
> But the places where they do not make sense at all are in comments, as part 
> of flowing text. There they are just disruptive and not helpful at all. I 
> tried to locate all such places (but I might have missed some, since I did 
> not do a proper lexical analysis to find comments) and fix them.
> 
> 99% of this fix is to turn poor `Peter von der Ah\u00e9` into `Peter von der 
> Ahé`. 😆 
> 
> I checked a few random samples to see when these were introduced and whether 
> some particular commit had mistreated the encoding, but they have been 
> there since the original release of the OpenJDK source code.
> 
> There are likely many more places where direct UTF-8 encoded characters are 
> preferable to unicode sequences, but this seemed like a safe and trivial 
> first step.
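
For illustration only, the kind of replacement described above can be sketched in a few lines of Python. This is not the tooling used for the pull request; the names `replace_escapes_in_comments`, `UNICODE_ESCAPE` and `COMMENT_LINE` are hypothetical. Like the description says of the original search, it does no proper lexical analysis: it assumes a line is comment text if it starts with `//`, `/*` or `*`, so it can miss trailing comments and can misfire on code lines that begin with `*`.

```python
import re

# A Java-style unicode escape: backslash, 'u', four hex digits.
# The lookbehind skips an escaped backslash such as "\\u00e9".
UNICODE_ESCAPE = re.compile(r"(?<!\\)\\u([0-9a-fA-F]{4})")

# Heuristic only (assumption): a line is comment text if it starts
# with //, /* or *. This is not a real lexer.
COMMENT_LINE = re.compile(r"^\s*(//|/\*|\*)")


def _decode(match):
    code_point = int(match.group(1), 16)
    # Leave ASCII escapes (e.g. \u000a) and surrogate halves untouched.
    if code_point < 0x80 or 0xD800 <= code_point <= 0xDFFF:
        return match.group(0)
    return chr(code_point)


def replace_escapes_in_comments(source):
    """Replace \\uXXXX with the literal character on comment-like lines."""
    return "\n".join(
        UNICODE_ESCAPE.sub(_decode, line) if COMMENT_LINE.match(line) else line
        for line in source.split("\n")
    )


sample = " * @author Peter von der Ah\\u00e9\nString s = \"Ah\\u00e9\";"
print(replace_escapes_in_comments(sample))
```

In the sample, the escape in the comment line becomes a literal `é`, while the one inside the string literal is left alone. The result should be written back as UTF-8 for the replaced characters to survive.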


Magnus Ihse Bursie has updated the pull request incrementally with one 
additional commit since the last revision:

  Clarify non-ASCII characters with unicode code point

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/24727/files
  - new: https://git.openjdk.org/jdk/pull/24727/files/dd9a77c5..53aa4066

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=24727&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=24727&range=02-03

  Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/24727.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24727/head:pull/24727

PR: https://git.openjdk.org/jdk/pull/24727
