On Fri, 2 Feb 2024 20:09:57 GMT, Srinivas Vamsi Parasa <d...@openjdk.org> wrote:
>> Hi Vamsi (@vamsi-parasa), Laurent(@bourgesl), >> >> The latest benchmarking compares compares the following versions: >> jdk - direct call of Arrays.sort(); >> a15 - the current source of DualPivotQuicksort from the latest build (except >> renaming) >> https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/DualPivotQuicksort.java >> https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_a15.java >> r20s - new version without Radix sort >> r20p - new version with Radix sort in parallel case only >> >> It is expected that timing of jdk and a15 should be more or less the same, >> but please look at the results: >> >> Benchmark | Data Type | Array Size | Arrays.sort() from jdk | Current source >> (a15) >> -- | -- | -- | -- | -- >> ArraysSort.Int.testSort | RANDOM | 600 | 1.612 | 7.096 >> ArraysSort.Int.testSort | RANDOM | 2000 | 6.893 | 44.014 >> ArraysSort.Int.testSort | RANDOM | 90000 | 522.749 | 4451.444 >> ArraysSort.Int.testSort | RANDOM | 400000 | 2424.204 | 22751.966 >> ArraysSort.Int.testSort | RANDOM | 3000000 | 21000.434 | 190326.306 >> ArraysSort.Int.testSort | REPEATED | 600 | 0.496 | 1.044 >> ArraysSort.Int.testSort | REPEATED | 2000 | 1.037 | 2.272 >> ArraysSort.Int.testSort | REPEATED | 90000 | 57.763 | 412.331 >> ArraysSort.Int.testSort | REPEATED | 400000 | 182.238 | 1908.978 >> ArraysSort.Int.testSort | REPEATED | 3000000 | 1708.082 | 15163.443 >> ArraysSort.Int.testSort | STAGGER | 600 | 1.038 | 1.055 >> ArraysSort.Int.testSort | STAGGER | 2000 | 3.434 | 3.408 >> ArraysSort.Int.testSort | STAGGER | 90000 | 148.638 | 149.220 >> ArraysSort.Int.testSort | STAGGER | 400000 | 663.076 | 663.096 >> ArraysSort.Int.testSort | STAGGER | 3000000 | 5212.821 | 5206.890 >> ArraysSort.Int.testSort | SHUFFLE | 600 | 1.926 | 4.611 >> ArraysSort.Int.testSort | SHUFFLE | 2000 | 6.858 | 17.955 >> ArraysSort.Int.testSort | SHUFFLE | 90000 | 473.441 | 1410.357 >> ArraysSort.Int.testSort | SHUFFLE | 400000 | 2153.779 | 5739.311 >> ArraysSort.Int.testSort | SHUFFLE | 3000000 | 18180.141 | 41501.980 >> >> You can see that a15 (current source) works extremly slower than >> Arrays.sort(), but the code is the same >> with minor differences: renaming and repackaging (I put the class to the >> test package org.openjdk.bench.java.util). >> How does other package org.openjdk.bench.java.util effect so much? >> >> I use this pom.xml file https://github.com/iarosla... > > Hi Vladimir (@iaroslavski), > > Please see the data below. All tests were run after putting the DPQS code in > java.util package and recompiling the JDK for each case. > > <html xmlns:v="urn:schemas-microsoft-com:vml" > xmlns:o="urn:schemas-microsoft-com:office:office" > xmlns:x="urn:schemas-microsoft-com:office:excel" > xmlns="http://www.w3.org/TR/REC-html40"> > > <head> > > <meta name=ProgId content=Excel.Sheet> > <meta name=Generator content="Microsoft Excel 15"> > <link id=Main-File rel=Main-File > href="file:///C:/Users/sparasa/AppData/Local/Temp/msohtmlclip1/01/clip.htm"> > <link rel=File-List > href="file:///C:/Users/sparasa/AppData/Local/Temp/msohtmlclip1/01/clip_filelist.xml"> > > > > </head> > > <body link="#0563C1" vlink="#954F72"> > > > Benchmark (us/op) | (builder) | (size) | Stock JDK | a15 | r20p | r20s > -- | -- | -- | -- | -- | -- | -- > ArraysSort.Int.testParallelSort | RANDOM | 600 | 2.24 | 2.201 | 2.423 | 2.389 > ArraysSort.Int.testParallelSort | RANDOM | 9000 | 35.318 | 35.961 | 79.028 | > 83.774 > ArraysSort.Int.testParallelSort | RANDOM | 20000 | 118.729 | 120.872 | > 134.829 | 138.349 > ArraysSort.Int.testParallelSort | RANDOM | 400000 | 822.676 | 822.44 | > 1200.858 | 872.264 > ArraysSort.Int.testParallelSort | RANDOM | 3000000 | 5864.514 | 5948.82 | > 8800.391 | 6020.616 > ArraysSort.Int.testParallelSort | REPEATED | 600 | 0.924 | 0.936 | 0.752 | > 0.733 > ArraysSort.Int.testParallelSort | REPEATED | 9000 | 9.896 | 9.317 | 31.409 | > 24.896 > ArraysSort.Int.testParallelSort | REPEATED | 20000 | 58.265 | 42.189 | 40.92 > | 40.101 > ArraysSort.Int.testParallelSort | REPEATED | 400000 | 256.952 | 253.217 | > 236.568 | 239.163 > ArraysSort.Int.testParallelSort | REPEATED | 3000000 | 2844.107 | 2851.088 | > 2752.939 | 3040.423 > ArraysSort.Int.testParallelSort | STAGGER | 600 | 2.245 | 2.296 | 2.15 | 2.219 > ArraysSort.Int.testParallelSort | STAGGER | 9000 | 29.278 | 29.119 | 28.288 | > 28.141 > ArraysSort.Int.testParallelSort | STAGGER | 20000 | 50.129 | 50.442 | 49.746 > | 49.686 > ArraysSort.Int.testParallelSort | STAGGER | 400000 | 463.309 | 413.619 | > 418.077 | 407.519 > ArraysSort.Int.testParallelSort | STAGGER | 3000000 | 3687.198 | 4363.242 | > 3732.777 | 3769.898 > ArraysSort.Int.testParallelSort | SHUFFLE | 600 | 1.715 | 1.698 | 2.799 | > 2.733 > ArraysSort.Int.testParallelSort | SHUFFLE | 9000 | 27.69 | 27.183 | 32.883 | > 32.373 > ArraysSort.Int.testParallelSort | SHUFFLE | 20000 | 62.067 | 60.987 | 63.281 > | 52.89 > ArraysSort.Int.testParallelSort | SHUFFLE | 400000 | 467.213 | 495.14 | > 650.173 | 476.403 > ArraysSort.Int.testParallelS... Hello Vamsi (@vamsi-parasa), Many thanks for the results! Now we can see that intrinsics are applied in all cases, but there are big differences between the same code. For example, parallelSort REPEATED 20000: 58.265(Stock JDK) and 42.189(a15) with speedup x1.38 parallelSort STAGGER 3000000: 3687.198(Stock JDK) 4363.242(a15) with speedup x0.85 Case a15 is the current source code from JDK, but in one benchmarking it is faster, in other benchmarking it is slower (~15-30%). Other strange behaviour with new sorting: r20p and r20s have the same code for sequential sorting (no radix sort at all), but we can see that on case works much slower sort STAGGER 3000000: 34406.74(r20p) and 10467.03(r20s) - 3.3 times slower, whereas other sizes show more or less equal values. Vamsi (@vamsi-parasa), Could you please run benchmarking of 5 cases with **updated** test class **ArraysSortNew**? https://github.com/iaroslavski/sorting/blob/master/radixsort/ArraysSortNew.java Put the DPQS code in java.util package and recompiling the JDK for each case as you did before, but run new ArraysSortNew. Find the sources there: https://github.com/iaroslavski/sorting/blob/master/radixsort/ArraysSortNew.java https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_jdk.java https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_r20p.java https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_r20s.java https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_r25p.java https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_r25s.java Thank you, Vladimir ------------- PR Comment: https://git.openjdk.org/jdk/pull/13568#issuecomment-1928130680