On Wed, 15 Mar 2023 12:28:05 GMT, Eirik Bjorsnos <d...@openjdk.org> wrote:
>>> `if ((ch & 0xFFFFFF00) == 0) {`
>>
>> This seems to perform similarly to the baseline:
>>
>> Benchmark            (codePoint)  Mode  Cnt  Score   Error  Units
>> Characters.isDigit            48  avgt   15  0.890 ± 0.025  ns/op
>> Characters.isDigit          1632  avgt   15  2.174 ± 0.011  ns/op
>>
>> It would be interesting to check the performance on non-Intel architectures. If you want to give it a spin on your M1, here's the benchmark command I used:
>>
>> `make test TEST='micro:java.lang.Characters.isDigit' MICRO="OPTIONS=-p codePoint=48,1632"`
>
>> It seems reasonable to keep these two in sync, yes. (`CharacterData.of` could even call into `StringLatin1.canEncode`, unless that's cause for some performance anomaly)
>
> If I update `StringLatin1.canEncode` and call into it from `CharacterData.of`, I observe no regression for the Latin1 case, but a significant regression for the non-Latin1 case. I have no idea how to explain that:
>
> Benchmark            (codePoint)  Mode  Cnt  Score   Error  Units
> Characters.isDigit            48  avgt   15  0.675 ± 0.029  ns/op
> Characters.isDigit          1632  avgt   15  2.435 ± 0.032  ns/op

Can you check what happens when you add many more inputs to the dataset, including non-Latin1 chars as well, and use `-prof perfnorm` to see what `perf` reports regarding branches/branch-misses?

You can use `SplittableRandom` to pre-populate an array of inputs whose sequence is "random" but still allows deterministic benchmarking, and feed the benchmark method by cycling through the pre-computed inputs (rough sketch below).

-------------

PR: https://git.openjdk.org/jdk/pull/13040
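Something like this is what I have in mind (a rough, untested sketch, not the existing `java.lang.Characters` micro; the class name, field names, and the choice of ASCII vs. ARABIC-INDIC digits are just for illustration):

```
import java.util.SplittableRandom;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
public class IsDigitRandomInputs {

    // Power of two, so cycling can use a cheap mask instead of a modulo
    private static final int SIZE = 1024;

    private int[] codePoints;
    private int next;

    @Setup
    public void setup() {
        // Fixed seed: the input sequence looks random to the branch
        // predictor, but is identical on every run and fork
        SplittableRandom random = new SplittableRandom(42);
        codePoints = new int[SIZE];
        for (int i = 0; i < SIZE; i++) {
            // Mix Latin1 and non-Latin1 inputs, e.g. ASCII digits
            // '0'..'9' and ARABIC-INDIC digits U+0660..U+0669
            codePoints[i] = random.nextBoolean()
                    ? '0' + random.nextInt(10)
                    : 0x0660 + random.nextInt(10);
        }
    }

    @Benchmark
    public boolean isDigit() {
        // Cycle through the pre-computed inputs; returning the result
        // lets JMH consume it (implicit blackhole)
        int cp = codePoints[next];
        next = (next + 1) & (SIZE - 1);
        return Character.isDigit(cp);
    }
}
```

If `perf` is available on your machine, running it with `MICRO="OPTIONS=-prof perfnorm"` (same pattern as the command above) should report branches and branch-misses per op, which you can then compare between baseline and patch.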