On Tue, 7 Feb 2023 20:32:11 GMT, Claes Redestad <redes...@openjdk.org> wrote:

>> src/java.base/share/classes/java/lang/String.java line 698:
>> 
>>> 696:     }
>>> 697: 
>>> 698:     static byte[] copyBytes(byte[] bytes, int offset, int length) {
>> 
>> Given that the stub generated for array copy seems highly dependent on the 
>> call site constraints, did you try adding a check for offset == 0 and/or 
>> length == bytes.length?
>> 
>> if (offset == 0 && bytes.length == length) {
>>     System.arraycopy(bytes, 0, dst, 0, bytes.length);
>>     // etc. for the other combinations
>> }
>> 
>> This should produce different generated stubs with much smaller ASM, depending 
>> on the enforced constraints (and shouldn't terribly affect the code size of 
>> the method, given that the stub won't be inlined AFAIK)
>> 
>> Beware, as noted by others, I'm not suggesting that's the way to fix this, 
>> but it would be interesting to check how much perf we leave on the table 
>> due to this supposedly "inefficient" stub generation (if that's the issue).
>
> I did some quick experiments but saw no clear win from doing anything like 
> this here. Feel free to experiment and see if there's some particular 
> configuration that comes out ahead.
> 
> FTR I did not intend for this RFE to solve 
> https://bugs.openjdk.org/browse/JDK-8295496 completely, but provide a small, 
> partial win that might possibly clear a path to solving that likely 
> orthogonal issue.

I've created a separate benchmark for this (named like yours by accident, since 
I used yours as a blueprint):
https://gist.github.com/franz1981/658c2bf6796aab4ae04a84bef1ef34b6
The results are:

Benchmark                             (offset)  (size)  Mode  Cnt   Score    Error  Units
StringConstructor.arrayCopy                  0       7  avgt   10   9.519 ±  0.131  ns/op
StringConstructor.arrayCopy                  1       7  avgt   10   9.194 ±  0.232  ns/op
StringConstructor.copyOf                     0       7  avgt   10  11.548 ±  0.133  ns/op
StringConstructor.copyOf                     1       7  avgt   10   9.812 ±  0.018  ns/op
StringConstructor.optimizedArrayCopy         0       7  avgt   10   6.854 ±  0.355  ns/op    <---- THAT'S COOL
StringConstructor.optimizedArrayCopy         1       7  avgt   10   9.088 ±  0.049  ns/op

The optimized array copy is helping C2 with stub generation: at offset 0 it 
scores 6.854 ns/op against 9.519 ns/op for the plain `arrayCopy`.
I haven't checked yet whether this applies to the `String` case, and I haven't 
created a dataset array long enough to check the effect of the newly introduced 
conditions on the branch predictor, but in terms of the generated stub there's 
a difference.
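
For readers who don't want to open the gist, here is a minimal sketch of what the three variants might look like. The method names follow the benchmark names in the table above; the bodies, the `CopyVariants` class name and the exact branch layout are my assumptions for illustration, not a copy of the gist or of the code in the PR.

```java
import java.util.Arrays;

// Hypothetical holder class; the names mirror the benchmark methods only.
final class CopyVariants {

    // Baseline: one call site with fully variable offset and length.
    static byte[] arrayCopy(byte[] bytes, int offset, int length) {
        byte[] dst = new byte[length];
        System.arraycopy(bytes, offset, dst, 0, length);
        return dst;
    }

    // Library variant: Arrays.copyOfRange takes a [from, to) range.
    static byte[] copyOf(byte[] bytes, int offset, int length) {
        return Arrays.copyOfRange(bytes, offset, offset + length);
    }

    // Branching variant: each call site gives the JIT tighter constraints
    // (constant zero offset, length equal to the source array length).
    static byte[] optimizedArrayCopy(byte[] bytes, int offset, int length) {
        if (offset == 0 && length == bytes.length) {
            // Whole-array copy: both offsets are the constant 0.
            byte[] dst = new byte[bytes.length];
            System.arraycopy(bytes, 0, dst, 0, bytes.length);
            return dst;
        }
        if (offset == 0) {
            // Prefix copy: the source offset is still the constant 0.
            byte[] dst = new byte[length];
            System.arraycopy(bytes, 0, dst, 0, length);
            return dst;
        }
        // General case: same shape as the baseline.
        byte[] dst = new byte[length];
        System.arraycopy(bytes, offset, dst, 0, length);
        return dst;
    }
}
```

All three return the same bytes for the same arguments; the only intended difference is how much the compiler knows at each `System.arraycopy` call site.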

-------------

PR: https://git.openjdk.org/jdk/pull/12453
