On Sat, 4 May 2024 19:35:21 GMT, Scott Gibbons <[email protected]> wrote:
>> Re-write the IndexOf code without the use of the pcmpestri instruction, only
>> using AVX2 instructions. This change accelerates String.IndexOf on average
>> 1.3x for AVX2. The benchmark numbers:
>>
>>
>> Benchmark Score
>> Latest
>> StringIndexOf.advancedWithMediumSub 343.573 317.934
>> 0.925375393x
>> StringIndexOf.advancedWithShortSub1 1039.081 1053.96
>> 1.014319384x
>> StringIndexOf.advancedWithShortSub2 55.828 110.541
>> 1.980027943x
>> StringIndexOf.constantPattern 9.361 11.906
>> 1.271872663x
>> StringIndexOf.searchCharLongSuccess 4.216 4.218
>> 1.000474383x
>> StringIndexOf.searchCharMediumSuccess 3.133 3.216
>> 1.02649218x
>> StringIndexOf.searchCharShortSuccess 3.76 3.761
>> 1.000265957x
>> StringIndexOf.success 9.186
>> 9.713 1.057369911x
>> StringIndexOf.successBig 14.341 46.343
>> 3.231504079x
>> StringIndexOfChar.latin1_AVX2_String 6220.918 12154.52
>> 1.953814533x
>> StringIndexOfChar.latin1_AVX2_char 5503.556 5540.044
>> 1.006629895x
>> StringIndexOfChar.latin1_SSE4_String 6978.854 6818.689
>> 0.977049957x
>> StringIndexOfChar.latin1_SSE4_char 5657.499 5474.624
>> 0.967675646x
>> StringIndexOfChar.latin1_Short_String 7132.541
>> 6863.359 0.962260014x
>> StringIndexOfChar.latin1_Short_char 16013.389 16162.437
>> 1.009307711x
>> StringIndexOfChar.latin1_mixed_String 7386.123 14771.622
>> 1.999915517x
>> StringIndexOfChar.latin1_mixed_char 9901.671 9782.245
>> 0.987938803
>
> Scott Gibbons has updated the pull request incrementally with one additional
> commit since the last revision:
>
> Rearrange; add lambdas for clarity
First pass at StringIndexOfHuge.java and IndexOf.java
test/jdk/java/lang/StringBuffer/IndexOf.java line 40:
> 38: private static boolean failure = false;
> 39: public static void main(String[] args) throws Exception {
> 40: String testName = "IndexOf";
intentation
test/jdk/java/lang/StringBuffer/IndexOf.java line 47:
> 45: char[] haystack_16 = new char[128];
> 46:
> 47: for (int i = 0; i < 128; i++) {
you can use `char` instead of `int` as iterator
test/jdk/java/lang/StringBuffer/IndexOf.java line 54:
> 52: // for (int i = 1; i < 128; i++) {
> 53: // haystack_16[i] = (char) (i);
> 54: // }
dead code
test/jdk/java/lang/StringBuffer/IndexOf.java line 64:
> 62: Charset hs_charset = StandardCharsets.UTF_16;
> 63: Charset needleCharset = StandardCharsets.ISO_8859_1;
> 64: // Charset needleCharset = StandardCharsets.UTF_16;
Move from main() into a function that takes `needleCharset` as a parameter,
then call that function twice.
test/jdk/java/lang/StringBuffer/IndexOf.java line 81:
> 79: sourceBuffer = new StringBuffer(sourceString);
> 80: targetString = generateTestString(10, 11);
> 81: } while (sourceString.indexOf(targetString) != -1);
Should really keep the original test unmodified and add new tests as needed
test/jdk/java/lang/StringBuffer/IndexOf.java line 83:
> 81: shs = "$&),,18+-!'8)+";
> 82: endNeedle = "8)-";
> 83: l_offset = 9;
dead code
test/jdk/java/lang/StringBuffer/IndexOf.java line 89:
> 87: StringBuffer bshs = new StringBuffer(shs);
> 88:
> 89: // printStringBytes(shs.getBytes(hs_charset));
dead code (and next two comments)
test/jdk/java/lang/StringBuffer/IndexOf.java line 90:
> 88:
> 89: // printStringBytes(shs.getBytes(hs_charset));
> 90: for (int i = 0; i < 200000; i++) {
This wont be a deterministic way to reach the intrinsic. I would suggest
copying the idea from
test/jdk/com/sun/crypto/provider/Cipher/ChaCha20/unittest/Poly1305UnitTestDriver.java
i.e. Have two `@run main` invocations at the top of this file, one with default
parameters, one with `-Xcomp -XX:-TieredCompilation`. You dont need a 'driver'
program, that was to handle something else.
/*
* @test
* @modules java.base/com.sun.crypto.provider
* @run main java.base/com.sun.crypto.provider.Poly1305KAT
* @summary Unit test for com.sun.crypto.provider.Poly1305.
*/
/*
* @test
* @modules java.base/com.sun.crypto.provider
* @summary Unit test for IntrinsicCandidate in
com.sun.crypto.provider.Poly1305.
* @run main/othervm -Xcomp -XX:-TieredCompilation
-XX:+UnlockDiagnosticVMOptions -XX:+ForceUnreachable
java.base/com.sun.crypto.provider.Poly1305KAT
*/
test/jdk/java/lang/StringBuffer/IndexOf.java line 126:
> 124: int aNewLength = getRandomIndex(min, max);
> 125: for (int y = 0; y < aNewLength; y++) {
> 126: int achar = generator.nextInt(30) + 30;
This will only ever generate LL cases, i.e. chars from [30,60]. Could be
parametrized to also produce utf16 if instead of 30, offset was in the unicode
range
test/jdk/java/lang/StringBuffer/IndexOf.java line 199:
> 197:
> System.out.println("Source="+sourceString.substring(hsBegin, hsBegin +
> haystackLen));
> 198:
> System.out.println("Target="+targetString.substring(nBegin, nBegin +
> needleLen));
> 199: System.out.println("haystackLen="+haystackLen+"
> neeldeLen="+needleLen+" hsBegin="+hsBegin+" nBegin="+nBegin+
This looks like 'development scaffolding' (i.e. printf debugging) that was
meant to be removed
test/jdk/java/lang/StringBuffer/IndexOf.java line 237:
> 235: + sourceBuffer.toString() + " len Buffer = " +
> sourceBuffer.toString().length());
> 236: System.err.println(" naive = " +
> naiveFind(sourceBuffer.toString(), targetString, 0) + ", IndexOf = "
> 237: + sourceBuffer.indexOf(targetString));
More tracing left behind here and rest of this function (original just recorded
failure and moved along)
test/jdk/java/lang/StringBuffer/IndexOf.java line 284:
> 282:
> 283: // Note: it is possible although highly improbable that failCount will
> 284: // be > 0 even if everthing is working ok
This sounds like either a bug or a testcase bug? Same as line 301, `extremely
remote possibility of > 1 match`?
test/jdk/java/lang/StringBuffer/IndexOf.java line 295:
> 293: sourceString = generateTestString(99, 100);
> 294: sourceBuffer = new StringBuffer(sourceString);
> 295: targetString = generateTestString(10, 11);
Generate a random int [0,1,2] for LL, UU, UL, pass that as parameter to
generateTestString() to test the other paths. Same for other tests in this file
using this pattern.
This test is specific to haystacklen=100, needlelen=10.. what about other
haystack/needle sizes to exercise all the paths in the intrinsic assembler
(i.e. haystack >=, <=32, needlelen ={1,2,3,4,5..32..}). Elsewhere already?
test/jdk/java/lang/StringBuffer/IndexOf.java line 360:
> 358: System.err.println(" sAnswer = " + sAnswer + ", sbAnswer = " +
> sbAnswer);
> 359: System.err.println(" testString = '" + testString + "'");
> 360: System.err.println(" testBuffer = '" + testBuffer + "'");
tracing left here and further down
test/micro/org/openjdk/bench/java/lang/StringIndexOfHuge.java line 2:
> 1: /*
> 2: * Copyright (c) 2014, 2024, Oracle and/or its affiliates. All rights
> reserved.
New file, just 2024
test/micro/org/openjdk/bench/java/lang/StringIndexOfHuge.java line 81:
> 79: lateMatchString16 =
> dataStringHuge16.substring(dataStringHuge16.length() - 31);
> 80:
> 81: searchString = "oscar";
Would had liked to see a few more small needles (i.e. to test/verify individual
switch statement cases)
test/micro/org/openjdk/bench/java/lang/StringIndexOfHuge.java line 94:
> 92:
> 93:
> 94: /** IndexOf Micros */
Would really had preferred @Param{"LL", "UU", "UL"}; would be easier to spot if
there are any copy/paste errors..
test/micro/org/openjdk/bench/java/lang/StringIndexOfHuge.java line 132:
> 130: @Benchmark
> 131: public int searchHugeLargeSubstring() {
> 132: return dataStringHuge.indexOf("B".repeat(30) + "X" +
> "A".repeat(30), 74);
.repeat() call and string concatenation shouldn't be part of the benchmark
(here and several other @Benchmark functions in this file) since it will
detract from the measurement.
(String concatenation gets converted (by javac) into
StringBuilder().append().append()....append().toString())
test/micro/org/openjdk/bench/java/lang/StringIndexOfHuge.java line 242:
> 240: @Benchmark
> 241: public int search16HugeLargeSubstring16() {
> 242: return dataStringHuge16.indexOf("B".repeat(30) + "X" +
> "A".repeat(30), 74);
`search16HugeLargeSubstring16` implies UU, but with `"B".repeat(30) + "X" +
"A".repeat(30)` is UL
-------------
PR Review: https://git.openjdk.org/jdk/pull/16753#pullrequestreview-2058681000
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602136400
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602140456
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602137044
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602158011
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602160330
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602144091
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602147967
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602153043
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602181943
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602162587
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602167728
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602184697
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602198158
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602171418
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602200123
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602133525
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602130679
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602047091
PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1602115797