On Mon, 26 Feb 2024 11:34:18 GMT, Claes Redestad <redes...@openjdk.org> wrote:
>> jdk.internal.reflect.UTF8 is used for encoding String to encoded UTF-8 when >> generating some classes. >> >> Since JDK 9 we have a fast-path (which avoids creating encoders) for >> UTF-8-encoding strings which is bootstrapped very early, so it seems safe to >> rewire this and remove the UTF8 helper class whose stated raison d'être is >> to avoid bootstrapping issues. >> >> This cleanup also removes a latent bug since the custom encoder isn't able >> to deal with classfile constants containing surrogate pairs. >> >> For a quick comparison I copied the UTF8 code to the `StringEncode` >> microbenchmark and set up a benchmark testing the same inputs as >> `encodeAllMixed`: >> >> >> Benchmark (charsetName) Mode Cnt >> Score Error Units >> StringEncode.encodeAllMixed UTF-8 avgt 10 >> 12894,551 ± 164,816 ns/op >> StringEncode.encodeUTF8InternalAllMixed UTF-8 avgt 10 >> 236614,548 ± 1445,975 ns/op >> >> >> I.e. `String.getBytes(UTF_8.instance)` is about 18x faster on mixed inputs. >> The benchmark is available in 595b464 but as a quick sanity check not >> intended for integration. >> >> Testing: tier1-3 > > Claes Redestad has updated the pull request incrementally with one additional > commit since the last revision: > > Remove temporary microbenchmark Marked as reviewed by alanb (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/18006#pullrequestreview-1901089641