On Mon, 26 Feb 2024 11:05:10 GMT, Claes Redestad <redes...@openjdk.org> wrote:
> jdk.internal.reflect.UTF8 is used for encoding String to encoded UTF-8 when
> generating some classes.
>
> Since JDK 9 we have a fast-path (which avoids creating encoders) for
> UTF-8-encoding strings which is bootstrapped very early, so it seems safe to
> rewire this and remove the UTF8 helper class whose stated raison d'être is to
> avoid bootstrapping issues.
>
> This cleanup also removes a latent bug since the custom encoder isn't able to
> deal with classfile constants containing surrogate pairs.
>
> For a quick comparison I copied the UTF8 code to the `StringEncode`
> microbenchmark and set up a benchmark testing the same inputs as
> `encodeAllMixed`:
>
>
> Benchmark                                (charsetName)  Mode  Cnt       Score      Error  Units
> StringEncode.encodeAllMixed                      UTF-8  avgt   10   12894,551 ±  164,816  ns/op
> StringEncode.encodeUTF8InternalAllMixed          UTF-8  avgt   10  236614,548 ± 1445,975  ns/op
>
>
> I.e. `String.getBytes(UTF_8.instance)` is about 18x faster on mixed inputs.
> The benchmark is available in 595b464 but as a quick sanity check not
> intended for integration.
>
> Testing: tier1-3

This pull request has now been integrated.

Changeset: c042f086
Author:    Claes Redestad <redes...@openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/c042f0863247633e98ace9757fb8531145286e66
Stats:     81 lines in 2 files changed: 2 ins; 78 del; 1 mod

8326653: Remove jdk.internal.reflect.UTF8

Reviewed-by: rriggs, alanb

-------------

PR: https://git.openjdk.org/jdk/pull/18006
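
For readers wondering what the surrogate-pair remark in the quoted description means in practice, here is a minimal sketch. It uses only the public `String.getBytes(Charset)` API with `StandardCharsets.UTF_8`; the class name `SurrogatePairEncoding` and its helper methods are made up for illustration and are not part of the changeset, and the sketch does not reproduce the removed helper's code. It shows that a character outside the Basic Multilingual Plane, stored in a Java `String` as two `char`s, is encoded as a single four-byte UTF-8 sequence and survives a round trip.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDK change.
public class SurrogatePairEncoding {

    // Encode using the standard UTF-8 charset via the public String API.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Render bytes as upper-case hex pairs separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+1F600 is stored as the surrogate pair D83D DE00 (two chars).
        String s = "\uD83D\uDE00";
        byte[] utf8 = encode(s);

        System.out.println("chars: " + s.length());
        System.out.println("bytes: " + hex(utf8));

        // Decoding the bytes again should give back the original string.
        String decoded = new String(utf8, StandardCharsets.UTF_8);
        System.out.println("round trip ok: " + decoded.equals(s));
    }
}
```

An encoder that handles each `char` in isolation cannot produce this four-byte form, because the code point is only known once both halves of the pair are combined.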