> jdk.internal.reflect.UTF8 is used for encoding Strings to UTF-8 when
> generating some classes.
>
> Since JDK 9 we have a fast-path (which avoids creating encoders) for
> UTF-8-encoding strings which is bootstrapped very early, so it seems safe to
> rewire this and remove the UTF8 helper class whose stated raison d'être is to
> avoid bootstrapping issues.
>
> This cleanup also removes a latent bug since the custom encoder isn't able to
> deal with classfile constants containing surrogate pairs.
>
> For a quick comparison I copied the UTF8 code to the `StringEncode`
> microbenchmark and set up a benchmark testing the same inputs as
> `encodeAllMixed`:
>
>
> Benchmark                                (charsetName)  Mode  Cnt       Score      Error  Units
> StringEncode.encodeAllMixed                      UTF-8  avgt   10   12894,551 ±  164,816  ns/op
> StringEncode.encodeUTF8InternalAllMixed          UTF-8  avgt   10  236614,548 ± 1445,975  ns/op
>
>
> I.e. `String.getBytes(UTF_8.INSTANCE)` is about 18x faster on mixed inputs.
> (I plan on removing `encodeUTF8InternalAllMixed` from the PR before merging,
> but wanted to include it initially to show what I've measured.)

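To illustrate the surrogate-pair point above, here is a minimal sketch. It is not the removed jdk.internal.reflect.UTF8 code: `encodePerChar` is a hypothetical encoder, written for this example only, that treats every UTF-16 code unit as its own code point. It is compared against `String.getBytes(StandardCharsets.UTF_8)`, which encodes a surrogate pair as a single 4-byte sequence.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class SurrogatePairEncoding {
    // Hypothetical per-code-unit encoder (an assumption for illustration,
    // not the JDK's removed helper). It never combines surrogate pairs.
    static byte[] encodePerChar(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                out.write(c);                          // 1-byte form
            } else if (c < 0x800) {
                out.write(0xC0 | (c >> 6));            // 2-byte form
                out.write(0x80 | (c & 0x3F));
            } else {
                out.write(0xE0 | (c >> 12));           // 3-byte form, also
                out.write(0x80 | ((c >> 6) & 0x3F));   // used for each lone
                out.write(0x80 | (c & 0x3F));          // surrogate half
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00"; // 'a' followed by U+1F600 as a surrogate pair

        byte[] standard = s.getBytes(StandardCharsets.UTF_8);
        byte[] perChar = encodePerChar(s);

        System.out.println(standard.length); // 5: 1 byte + one 4-byte sequence
        System.out.println(perChar.length);  // 7: 1 byte + two 3-byte sequences
    }
}
```

The two encoders agree on input without surrogates; they diverge only once a supplementary character appears, where the per-code-unit output is no longer valid standard UTF-8.
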
Claes Redestad has updated the pull request incrementally with one additional commit since the last revision:

  Remove temporary microbenchmark

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/18006/files
  - new: https://git.openjdk.org/jdk/pull/18006/files/595b464d..1293b167

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=18006&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=18006&range=00-01

  Stats: 62 lines in 1 file changed: 0 ins; 62 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/18006.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18006/head:pull/18006

PR: https://git.openjdk.org/jdk/pull/18006