Re: RFR: 8304245: Speed up CharacterData.of by avoiding bit shifting in the latin1 fast-path test

Eirik Bjorsnos Wed, 15 Mar 2023 06:46:17 -0700

On Wed, 15 Mar 2023 12:37:24 GMT, Francesco Nigro <d...@openjdk.org> wrote:


>>> It seems reasonable to keep these two in sync, yes. (`CharacterData.of` 
>>> could even call into `StringLatin1.canEncode`, unless that's cause for some 
>>> performance anomaly)
>> 
>> If I update `StringLatin1.canEncode` and call into that from 
>> `CharacterData.of`, I observe no regression for the Latin1 case, but a 
>> significant regression for the non-Latin1 case. I have no idea how to 
>> explain that:
>> 
>> 
>> Benchmark           (codePoint)  Mode  Cnt  Score   Error  Units
>> Characters.isDigit           48  avgt   15  0.675 ± 0.029  ns/op
>> Characters.isDigit         1632  avgt   15  2.435 ± 0.032  ns/op
>
> Can you check what happens when adding many more inputs to the dataset, 
> including non-Latin chars as well, and use `-prof perfnorm` to check what 
> `perf` reports re branches/branch-misses?
> 
> You can use `SplittableRandom` to pre-populate an array of inputs whose 
> sequence is "random" but still allows deterministic benchmarking, and feed 
> the benchmark method by cycling through the pre-computed inputs.
> In the real world I expect `isDigit` to be called on different input types, 
> so having both C2 place both branches based on the previous input 
> distribution and a confused branch predictor allows comparing against 
> something that looks a bit closer to the real world (TBD, I know).
> I expect that in that case a single cmp + mask works better, depending on 
> the Latin input distribution/occurrence.

I created a randomized version of `Characters.isDigit` that tests with code 
points picked at random such that every category (Latin1, negative, different 
planes, unassigned) is equally probable.
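
As a rough illustration of the approach (this is not the actual benchmark 
code), the input generation could be sketched in plain Java as below. The 
class name `RandomCodePoints`, the seed, the array size and the exact category 
boundaries are assumptions made for this sketch. In particular, planes 4-13 
are only assumed to be unassigned, and JMH annotations are left out so the 
snippet runs on its own.

```java
import java.util.SplittableRandom;

// Hypothetical helper for illustration; not the benchmark from the PR.
final class RandomCodePoints {
    static final int CATEGORIES = 5;

    // Pre-computes `size` code points. A fixed seed keeps runs repeatable.
    static int[] generate(long seed, int size) {
        SplittableRandom rnd = new SplittableRandom(seed);
        int[] cps = new int[size];
        for (int i = 0; i < size; i++) {
            switch (rnd.nextInt(CATEGORIES)) { // each category equally likely
                case 0:  cps[i] = rnd.nextInt(0x100); break;                  // Latin1
                case 1:  cps[i] = -1 - rnd.nextInt(Integer.MAX_VALUE); break; // negative
                case 2:  cps[i] = rnd.nextInt(0x100, 0x10000); break;         // rest of the BMP
                case 3:  cps[i] = rnd.nextInt(0x10000, 0x20000); break;       // plane 1
                default: cps[i] = rnd.nextInt(0x40000, 0xE0000); break;       // planes 4-13 (assumed unassigned)
            }
        }
        return cps;
    }

    // Feeds the method under test by cycling through the pre-computed inputs.
    static int countDigits(int[] cps, int iterations) {
        int digits = 0;
        for (int i = 0; i < iterations; i++) {
            if (Character.isDigit(cps[i % cps.length])) {
                digits++;
            }
        }
        return digits;
    }

    public static void main(String[] args) {
        int[] cps = generate(42L, 1024);
        System.out.println("digits seen: " + countDigits(cps, 1_000_000));
    }
}
```

In a JMH benchmark the array would typically be built in a `@Setup` method 
and the loop body would become the benchmark method, with an index kept in 
the state object.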

Baseline:


Benchmark                 (codePoint)  Mode  Cnt  Score   Error  Units
Characters.isDigitRandom         1632  avgt   15  5.503 ± 0.371  ns/op


Current PR:


Benchmark                 (codePoint)  Mode  Cnt  Score   Error  Units
Characters.isDigitRandom         1632  avgt   15  5.393 ± 0.336  ns/op


Using StringLatin1.canEncode:


Benchmark                 (codePoint)  Mode  Cnt  Score   Error  Units
Characters.isDigitRandom         1632  avgt   15  5.377 ± 0.322  ns/op


It seems the PR still shows a small improvement in this scenario, and the 
`StringLatin1.canEncode` regression disappears.
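
For reference, the kinds of Latin1 tests being compared can be sketched as 
follows. The expressions are assumptions for illustration, and the exact 
forms used in the PR and in `StringLatin1.canEncode` may differ. The class 
name `Latin1Check` is hypothetical. The three variants accept exactly the same 
inputs and differ only in the operations used.

```java
// Hypothetical illustration of equivalent Latin1 range tests.
final class Latin1Check {
    // Shift-based test: true when the upper 24 bits are all zero.
    static boolean byShift(int cp) {
        return cp >>> 8 == 0;
    }

    // Mask-based test: a single AND plus a compare, no shift.
    static boolean byMask(int cp) {
        return (cp & 0xFFFFFF00) == 0;
    }

    // Plain range test, shown for comparison.
    static boolean byRange(int cp) {
        return cp >= 0 && cp <= 0xFF;
    }

    public static void main(String[] args) {
        int[] samples = {0, 48, 0xFF, 0x100, 1632, 0x10FFFF, -1, Integer.MIN_VALUE};
        for (int cp : samples) {
            System.out.println(cp + ": shift=" + byShift(cp)
                    + " mask=" + byMask(cp) + " range=" + byRange(cp));
        }
    }
}
```

Negative values are rejected by all three variants, since their sign bit is 
among the upper 24 bits.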

In the real world ASCII/Latin1 seems to dominate most data, so this scenario is 
perhaps not very realistic.

I'm running this on a Mac, so I cannot try `-prof perfnorm`.

-------------

PR: https://git.openjdk.org/jdk/pull/13040
