On Wed, 15 Mar 2023 11:58:14 GMT, Claes Redestad <redes...@openjdk.org> wrote:
>> By avoiding a bit shift operation for the latin1 fast-path test, we can
>> speed up the `java.lang.CharacterData.of` method by ~25% for latin1 code
>> points.
>>
>> The latin1 test is currently implemented as `ch >>> 8 == 0`. We can replace
>> this with `ch >= 0 && ch <= 0xFF` for a noticeable performance gain
>> (especially for Latin1 code points).
>>
>> This method is called frequently by various property-determining methods in
>> `java.lang.Character` like `isLowerCase`, `isDigit` etc, so one should
>> expect improvements for all these methods.
>>
>> Performance is tested using the `Characters.isDigit` benchmark using the
>> digits '0' (decimal 48, in CharacterDataLatin1) and '\u0660' (decimal 1632,
>> in CharacterData00):
>>
>> Baseline:
>>
>>
>> Benchmark           (codePoint)  Mode  Cnt  Score   Error  Units
>> Characters.isDigit           48  avgt   15  0.870 ± 0.011  ns/op
>> Characters.isDigit         1632  avgt   15  2.168 ± 0.017  ns/op
>>
>> PR:
>>
>>
>> Benchmark           (codePoint)  Mode  Cnt  Score   Error  Units
>> Characters.isDigit           48  avgt   15  0.654 ± 0.007  ns/op
>> Characters.isDigit         1632  avgt   15  2.032 ± 0.019  ns/op
>
> src/java.base/share/classes/java/lang/CharacterData.java line 72:
>
>> 70:
>> 71:     static final CharacterData of(int ch) {
>> 72:         if (ch >= 0 && ch <= 0xFF) {     // fast-path
>
> Maybe reducing to a single branch with a mask op helps further? Or maybe the
> JIT already effectively does that:
>
> `if ((ch & 0xFFFFFF00) == 0) {`

Btw, I think we can do the same for `StringLatin1.canEncode()`

-------------

PR: https://git.openjdk.org/jdk/pull/13040
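
For reference, a minimal sketch (not part of the PR) of the three forms of the latin1 test discussed in this thread: the current shift, the proposed range check, and the suggested mask. The class and method names are hypothetical and exist only for illustration. It checks that the three forms agree on a handful of sample values; it says nothing about which one the JIT compiles best, which only a benchmark can show.

```java
// Hypothetical illustration class; not JDK code.
public class Latin1Checks {

    // Current form in the discussion: unsigned shift, then compare.
    static boolean shiftCheck(int ch) {
        return ch >>> 8 == 0;
    }

    // Form proposed in the PR: explicit range check.
    static boolean rangeCheck(int ch) {
        return ch >= 0 && ch <= 0xFF;
    }

    // Form suggested in review: single mask. The parentheses are required,
    // because == binds tighter than & in Java.
    static boolean maskCheck(int ch) {
        return (ch & 0xFFFFFF00) == 0;
    }

    // True when all three forms return the same answer for ch.
    static boolean allAgree(int ch) {
        boolean expected = shiftCheck(ch);
        return rangeCheck(ch) == expected && maskCheck(ch) == expected;
    }

    public static void main(String[] args) {
        // Boundary values plus the two benchmark code points (48 and 1632).
        int[] samples = {
            Integer.MIN_VALUE, -1, 0, 48, 0xFF, 0x100, 1632,
            Character.MAX_CODE_POINT, Integer.MAX_VALUE
        };
        for (int ch : samples) {
            if (!allAgree(ch)) {
                throw new AssertionError("checks disagree for " + ch);
            }
        }
        System.out.println("all three checks agree");
    }
}
```

Negative inputs are the case worth checking: all three forms reject them, so the range check and the mask are drop-in replacements for the shift as far as results go.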