> jdk.internal.reflect.UTF8 is used to encode Strings as UTF-8 when 
> generating some classes. 
> 
> Since JDK 9 we have a fast-path (which avoids creating encoders) for 
> UTF-8-encoding strings which is bootstrapped very early, so it seems safe to 
> rewire this and remove the UTF8 helper class whose stated raison d'être is to 
> avoid bootstrapping issues.
> 
> This cleanup also removes a latent bug since the custom encoder isn't able to 
> deal with classfile constants containing surrogate pairs.
> 
> For a quick comparison I copied the UTF8 code to the `StringEncode` 
> microbenchmark and set up a benchmark testing the same inputs as 
> `encodeAllMixed`:
> 
> 
> Benchmark                                (charsetName)  Mode  Cnt       Score      Error  Units
> StringEncode.encodeAllMixed                      UTF-8  avgt   10   12894,551 ±  164,816  ns/op
> StringEncode.encodeUTF8InternalAllMixed          UTF-8  avgt   10  236614,548 ± 1445,975  ns/op
> 
> 
> I.e. `String.getBytes(UTF_8.INSTANCE)` is about 18x faster on mixed inputs. 
> (I plan on removing `encodeUTF8InternalAllMixed` from the PR before merging, 
> but wanted to include it initially to show what I've measured.)
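
To make the surrogate-pair point in the description concrete, here is a minimal sketch. It assumes the failure mode is an encoder that handles each UTF-16 code unit on its own; `encodePerChar` and the class name `SurrogateEncodingSketch` are hypothetical and written only for this illustration, they are not the removed helper's code. `encodeStandard` uses the public `StandardCharsets.UTF_8` rather than the internal charset instance.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class SurrogateEncodingSketch {

    // Hypothetical per-code-unit encoder (illustration only, NOT the removed class).
    // Each char is encoded independently, so a surrogate pair becomes two
    // 3-byte sequences instead of one 4-byte sequence.
    static byte[] encodePerChar(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                out.write(c);
            } else if (c < 0x800) {
                out.write(0xC0 | (c >> 6));
                out.write(0x80 | (c & 0x3F));
            } else {
                out.write(0xE0 | (c >> 12));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
        }
        return out.toByteArray();
    }

    // Standard library path: surrogate pairs are combined into one code point
    // and emitted as a single 4-byte UTF-8 sequence.
    static byte[] encodeStandard(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00"; // 'a' followed by U+1F600 (a surrogate pair)
        System.out.println(encodeStandard(s).length); // 5 = 1 + 4
        System.out.println(encodePerChar(s).length);  // 7 = 1 + 3 + 3
    }
}
```

For ASCII-only input the two methods produce identical bytes; they only diverge once the string contains characters outside the Basic Multilingual Plane.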

Claes Redestad has updated the pull request incrementally with one additional 
commit since the last revision:

  Remove temporary microbenchmark

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18006/files
  - new: https://git.openjdk.org/jdk/pull/18006/files/595b464d..1293b167

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18006&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18006&range=00-01

  Stats: 62 lines in 1 file changed: 0 ins; 62 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/18006.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18006/head:pull/18006

PR: https://git.openjdk.org/jdk/pull/18006
