On Wed, 15 Mar 2023 14:23:35 GMT, Eirik Bjorsnos <d...@openjdk.org> wrote:

>> Many thanks to have tried, yep, I was curious indeed re the 
>> "StringLatin1.canEncode regression" case.
>> I would still modify the benchmark to use inputs (I know that will make it 
>> memory bound sadly, due to reading inputs - but the size of such inputs can 
>> be a benchmark parameter, together with the bias eg "latin","mix", 
>> "non-latin") "semi-randomly" generated based on the mentioned 
>> strategies/biases. 
>> It will benefit future tests on this, although could be provided as a 
>> separate PR.
>
>> The StringLatin1.canEncode regression disappears.
> 
> I mixed things up so StringLatin1.canEncode was benchmarked without the 
> updated code.
> 
> Here are updated benchmark results:
> 
> 
> Baseline:
> 
> 
> Benchmark                 (codePoint)  Mode  Cnt  Score   Error  Units
> Characters.isDigitRandom         1632  avgt   15  5.437 ± 0.235  ns/op
> 
> 
> PR:
> 
> 
> Benchmark                 (codePoint)  Mode  Cnt  Score   Error  Units
> Characters.isDigitRandom         1632  avgt   15  5.319 ± 0.341  ns/op
> 
> 
> StringLatin1.canEncode:
> 
> 
> Benchmark                 (codePoint)  Mode  Cnt  Score   Error  Units
> Characters.isDigitRandom         1632  avgt   15  5.447 ± 0.304  ns/op
> ``` 
> 
> So it seems using StringLatin1.canEncode still might have a regression also 
> in the randomized input case.
> 
> For this PR, I suggest we update StringLatin1.canEncode to be in sync with 
> CharacterData.of, without one calling the other. If anyone wants to 
> investigate the regression further, than can be done outside this PR.
> 
> I have independently verified that StringLatin1.canEncode sees performance 
> improvements using the StringIndexOf benchmark.

We can do `Integer.compareUnsigned(ch, 0xFF) <= 0`

-------------

PR: https://git.openjdk.org/jdk/pull/13040

Reply via email to