On Mon, 8 Jan 2024 20:48:39 GMT, Scott Gibbons <sgibb...@openjdk.org> wrote:

>> Re-write the IndexOf code without the use of the pcmpestri instruction, only 
>> using AVX2 instructions.  This change accelerates String.IndexOf on average 
>> 1.3x for AVX2.  The benchmark numbers:
>> 
>> 
>> Benchmark                                                   Score            
>> Latest          
>> StringIndexOf.advancedWithMediumSub   343.573                317.934         
>> 0.925375393x
>> StringIndexOf.advancedWithShortSub1    1039.081              1053.96         
>> 1.014319384x
>> StringIndexOf.advancedWithShortSub2        55.828            110.541         
>> 1.980027943x
>> StringIndexOf.constantPattern                        9.361           11.906  
>>         1.271872663x
>> StringIndexOf.searchCharLongSuccess          4.216           4.218           
>> 1.000474383x
>> StringIndexOf.searchCharMediumSuccess        3.133           3.216           
>> 1.02649218x
>> StringIndexOf.searchCharShortSuccess 3.76                    3.761           
>> 1.000265957x
>> StringIndexOf.success                                        9.186           
>> 9.713           1.057369911x
>> StringIndexOf.successBig                           14.341            46.343  
>>         3.231504079x
>> StringIndexOfChar.latin1_AVX2_String   6220.918              12154.52        
>>         1.953814533x
>> StringIndexOfChar.latin1_AVX2_char     5503.556              5540.044        
>>         1.006629895x
>> StringIndexOfChar.latin1_SSE4_String   6978.854              6818.689        
>>         0.977049957x
>> StringIndexOfChar.latin1_SSE4_char     5657.499              5474.624        
>>         0.967675646x
>> StringIndexOfChar.latin1_Short_String          7132.541              
>> 6863.359                0.962260014x
>> StringIndexOfChar.latin1_Short_char  16013.389             16162.437         
>> 1.009307711x
>> StringIndexOfChar.latin1_mixed_String          7386.123            14771.622 
>>         1.999915517x
>> StringIndexOfChar.latin1_mixed_char    9901.671              9782.245        
>>         0.987938803
>
> Scott Gibbons has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 21 commits:
> 
>  - Merge branch 'openjdk:master' into indexof
>  - Addressing review comments.
>  - Fix for JDK-8321599
>  - Support UU IndexOf
>  - Only use optimization when EnableX86ECoreOpts is true
>  - Fix whitespace
>  - Merge branch 'openjdk:master' into indexof
>  - Comments; added exhaustive-ish test
>  - Subtracting 0x10 twice.
>  - Stomped on r13 in switch branch calculation
>  - ... and 11 more: https://git.openjdk.org/jdk/compare/8a4dc79e...600377b0

@asgibbons I cannot yet promise to review this, I just left a few comments 
after scrolling through this change. I'm especially scared of reviewing 
`src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp`.

I launched testing for Commit 21 / v05.

Maybe an ignorant question: How would avx512 be affected?

src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 83:

> 81: 
> 82: //   const __m256i first = _mm256_set1_epi8(needle[0]);
> 83: //   const __m256i last = _mm256_set1_epi8(needle[k - 1]);

I think it would be nicer if you had comment `//` on every line, and no gaps.

src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 1608:

> 1606:     //   vector compares when size is 2 * VEC_SIZE or less. 38    8. 
> Use 4
> 1607:     //   vector compares when size is 4 * VEC_SIZE or less. 39    9. 
> Use 8
> 1608:     //   vector compares when size is 8 * VEC_SIZE or less.  */

Is this formatting intended?

src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 1672:

> 1670: 
> 1671:     // 98              VPCMPEQ VEC_SIZE(%rdi), %ymm2, %ymm2
> 1672:     // 99              vpmovmskb %ymm2, %eax

It seems that here the comments and code is strangely interleaved / shifted. 
What is this all for?

src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 2301:

> 2299:     // 388             setg    %dl
> 2300:     // 389             leal    -1(%rdx, %rdx), %eax
> 2301:     __ movzbl(rcx, Address(rsi, rax, Address::times_1, -0x20));

Down here it is even worse

test/jdk/java/lang/StringBuffer/IndexOf.java line 34:

> 32: public class IndexOf {
> 33: 
> 34:   static Random generator = new Random(1999);

Would it be an alternative to use the this:

import jdk.test.lib.Utils;
...
Random random = Utils.getRandomInstance();

This has a random seed, but it is always printed in the output and can be set 
via a test-flag.

test/jdk/java/lang/StringBuffer/IndexOf.java line 44:

> 42:     }
> 43:     System.out.println("");
> 44:     generator.setSeed(1999);

Is there a good reason for a fixed seed?

-------------

Changes requested by epeter (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16753#pullrequestreview-1811353722
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1446211178
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1446210544
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1446216019
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1446217305
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1446221928
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1446223038

Reply via email to