On Fri, 26 Jan 2024 17:19:25 GMT, Srinivas Vamsi Parasa <d...@openjdk.org> 
wrote:

>> Hello Vamsi (@vamsi-parasa),
>> 
>> Could you please run the benchmarking of new DQPS in your environment with 
>> AVX?
>> 
>> Take all classes below and put them in the package 
>> org.openjdk.bench.java.util.
>> ArraysSort class contains all tests for the new versions and ready to use.
>> (it will run all tests in one execution).
>> 
>> https://github.com/iaroslavski/sorting/blob/master/radixsort/ArraysSort.java
>> https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_a15.java
>> https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_r20s.java
>> https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_r20p.java
>> 
>> Many thanks,
>> Vladimir
>
> Hi Vladimir (@iaroslavski),
> 
> Please see the JMH data below.
> 
> Thanks,
> Vamsi
> 
> Benchmark              (builder)   (size)  Mode  Cnt       Score      Error  
> Units
> ArraysSort.Int.a15        RANDOM      600  avgt    4       7.096 ±    0.081  
> us/op
> ArraysSort.Int.a15        RANDOM     2000  avgt    4      44.014 ±    1.717  
> us/op
> ArraysSort.Int.a15        RANDOM    90000  avgt    4    4451.444 ±   71.168  
> us/op
> ArraysSort.Int.a15        RANDOM   400000  avgt    4   22751.966 ±  683.597  
> us/op
> ArraysSort.Int.a15        RANDOM  3000000  avgt    4  190326.306 ± 8008.512  
> us/op
> ArraysSort.Int.a15      REPEATED      600  avgt    4       1.044 ±    0.016  
> us/op
> ArraysSort.Int.a15      REPEATED     2000  avgt    4       2.272 ±    0.287  
> us/op
> ArraysSort.Int.a15      REPEATED    90000  avgt    4     412.331 ±   11.656  
> us/op
> ArraysSort.Int.a15      REPEATED   400000  avgt    4    1908.978 ±   30.241  
> us/op
> ArraysSort.Int.a15      REPEATED  3000000  avgt    4   15163.443 ±  100.425  
> us/op
> ArraysSort.Int.a15       STAGGER      600  avgt    4       1.055 ±    0.057  
> us/op
> ArraysSort.Int.a15       STAGGER     2000  avgt    4       3.408 ±    0.096  
> us/op
> ArraysSort.Int.a15       STAGGER    90000  avgt    4     149.220 ±    4.022  
> us/op
> ArraysSort.Int.a15       STAGGER   400000  avgt    4     663.096 ±   30.252  
> us/op
> ArraysSort.Int.a15       STAGGER  3000000  avgt    4    5206.890 ±  234.857  
> us/op
> ArraysSort.Int.a15       SHUFFLE      600  avgt    4       4.611 ±    0.118  
> us/op
> ArraysSort.Int.a15       SHUFFLE     2000  avgt    4      17.955 ±    0.356  
> us/op
> ArraysSort.Int.a15       SHUFFLE    90000  avgt    4    1410.357 ±   41.128  
> us/op
> ArraysSort.Int.a15       SHUFFLE   400000  avgt    4    5739.311 ±  128.270  
> us/op
> ArraysSort.Int.a15       SHUFFLE  3000000  avgt    4   41501.980 ±  829.443  
> us/op
> ArraysSort.Int.jdk        RANDOM      600  avgt    4       1.612 ±    0.088  
> us/op
> ArraysSort.Int.jdk        RANDOM     2000  avgt    4       6.893 ±    0.375  
> us/op
> ArraysSort.Int.jdk        RANDOM    90000  avgt    4     522.749 ±   19.386  
> us/op
> ArraysSort.Int.jdk        RANDOM   400000  avgt    4    2424.204 ±   63.844  
> us/op
> ArraysSort.Int.jdk        RANDOM  3000000  avgt    4   21000.434 ±  801.315  
> us/op
> ArraysSort.Int.jdk      REPEATED      600  avgt    4       0.496 ±    0.030  
> us/op
> ArraysSort.Int.jdk      REPEATED     2000  avgt    4       1.037 ±    0.083  
> us/op
> ArraysSort.Int.jdk      REPEATED    90000  avgt    4      57.763 ±    1.955  
> us/op
> ArraysSort.Int.jd...

Hi Vamsi (@vamsi-parasa), Laurent(@bourgesl),

The latest benchmarking compares compares the following versions:
jdk - direct call of Arrays.sort();
a15 - the current source of DualPivotQuicksort from the latest build (except 
renaming)
https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/DualPivotQuicksort.java
https://github.com/iaroslavski/sorting/blob/master/radixsort/DualPivotQuicksort_a15.java
r20s - new version without Radix sort
r20p - new version with Radix sort in parallel case only

It is expected that timing of jdk and a15 should be more or less the same, but 
please look at the results:

Benchmark | Data Type | Array Size | Arrays.sort() from jdk | Current source 
(a15)
-- | -- | -- | -- | --
ArraysSort.Int.testSort | RANDOM   |     600 | 1.612     | 7.096
ArraysSort.Int.testSort | RANDOM   |    2000 | 6.893     | 44.014
ArraysSort.Int.testSort | RANDOM   |   90000 | 522.749   | 4451.444
ArraysSort.Int.testSort | RANDOM   |  400000 | 2424.204  | 22751.966
ArraysSort.Int.testSort | RANDOM   | 3000000 | 21000.434 | 190326.306
ArraysSort.Int.testSort | REPEATED |     600 | 0.496     | 1.044
ArraysSort.Int.testSort | REPEATED |    2000 | 1.037     | 2.272
ArraysSort.Int.testSort | REPEATED |   90000 | 57.763    | 412.331
ArraysSort.Int.testSort | REPEATED |  400000 | 182.238   | 1908.978
ArraysSort.Int.testSort | REPEATED | 3000000 | 1708.082  | 15163.443
ArraysSort.Int.testSort | STAGGER  |     600 | 1.038     | 1.055
ArraysSort.Int.testSort | STAGGER  |    2000 | 3.434     | 3.408
ArraysSort.Int.testSort | STAGGER  |   90000 | 148.638   | 149.220
ArraysSort.Int.testSort | STAGGER  |  400000 | 663.076   | 663.096
ArraysSort.Int.testSort | STAGGER  | 3000000 | 5212.821  | 5206.890
ArraysSort.Int.testSort | SHUFFLE  |     600 | 1.926     | 4.611
ArraysSort.Int.testSort | SHUFFLE  |    2000 | 6.858     | 17.955
ArraysSort.Int.testSort | SHUFFLE  |   90000 | 473.441   | 1410.357
ArraysSort.Int.testSort | SHUFFLE  |  400000 | 2153.779  | 5739.311
ArraysSort.Int.testSort | SHUFFLE  | 3000000 | 18180.141 | 41501.980

You can see that a15 (current source) works extremly slower than Arrays.sort(), 
but the code is the same
with minor differences: renaming and repackaging (I put the class to the test 
package org.openjdk.bench.java.util).
How does other package org.openjdk.bench.java.util effect so much?

I use this pom.xml file 
https://github.com/iaroslavski/sorting/blob/master/radixsort/pom.xml and run as 
following script:

`mvn clean package`
`java --add-exports=java.base/jdk.internal.vm.annotation=ALL-UNNAMED 
--add-exports=java.base/jdk.internal.misc=ALL-UNNAMED -jar 
target/benchmarks.jar org.openjdk.bench.java.util.ArraysSort`

It looks like intrinsics have not been applied. Check STAGGER inputs: you can 
see the same results because
STAGGER switches to merging sort wich doesn't contain intrinsics, but other 
inputs have different behaviour.

Therefore, I compare a15 (current sources) and the new version (r20p).
we can see the new version faster (only one anomaly on REPEATED, 2000).

Benchmark | Data Type | Array Size | Current source (a15) | New version (r20p) 
| Speedup
-- | -- | -- | -- | -- | --
ArraysSort.Int.testSort | RANDOM   |     600 | 7.096      | 4.343      | 1.63
ArraysSort.Int.testSort | RANDOM   |    2000 | 44.014     | 20.323     | 2.17
ArraysSort.Int.testSort | RANDOM   |   90000 | 4451.444   | 4073.793   | 1.09
ArraysSort.Int.testSort | RANDOM   |  400000 | 22751.966  | 19768.316  | 1.15
ArraysSort.Int.testSort | RANDOM   | 3000000 | 190326.306 | 165296.770 | 1.15
ArraysSort.Int.testSort | REPEATED |     600 | 1.044      | 1.048      | 1.00
ArraysSort.Int.testSort | REPEATED |    2000 | 2.272      | 3.212      | 0.71
ArraysSort.Int.testSort | REPEATED |   90000 | 412.331    | 410.775    | 1.00
ArraysSort.Int.testSort | REPEATED |  400000 | 1908.978   | 1869.529   | 1.02
ArraysSort.Int.testSort | REPEATED | 3000000 | 15163.443  | 14302.261  | 1.06
ArraysSort.Int.testSort | STAGGER  |     600 | 1.055      | 0.808      | 1.31
ArraysSort.Int.testSort | STAGGER  |    2000 | 3.408      | 2.683      | 1.27
ArraysSort.Int.testSort | STAGGER  |   90000 | 149.220    | 121.519    | 1.23
ArraysSort.Int.testSort | STAGGER  |  400000 | 663.096    | 548.277    | 1.21
ArraysSort.Int.testSort | STAGGER  | 3000000 | 5206.890   | 4496.204   | 1.16
ArraysSort.Int.testSort | SHUFFLE  |     600 | 4.611      | 3.383      | 1.36
ArraysSort.Int.testSort | SHUFFLE  |    2000 | 17.955     | 11.400     | 1.57
ArraysSort.Int.testSort | SHUFFLE  |   90000 | 1410.357   | 1001.561   | 1.41
ArraysSort.Int.testSort | SHUFFLE  |  400000 | 5739.311   | 4667.466   | 1.23
ArraysSort.Int.testSort | SHUFFLE  | 3000000 | 41501.980  | 36284.178  | 1.14

**Parallel sort**

Benchmark | Data Type | Array Size | Current source (a15) | New version (r20p) 
| Speedup
-- | -- | -- | -- | -- | --
ArraysSort.Int.testParallelSort | RANDOM   |     600 | 7.102     | 4.315    | 
1.65
ArraysSort.Int.testParallelSort | RANDOM   |    2000 | 43.708    | 20.322   | 
2.15
ArraysSort.Int.testParallelSort | RANDOM   |   90000 | 743.725   | 409.892  | 
1.81
ArraysSort.Int.testParallelSort | RANDOM   |  400000 | 3099.628  | 1322.129 | 
2.34
ArraysSort.Int.testParallelSort | RANDOM   | 3000000 | 25830.847 | 9790.138 | 
2.64
ArraysSort.Int.testParallelSort | REPEATED |     600 | 1.042     | 1.033    | 
1.01
ArraysSort.Int.testParallelSort | REPEATED |    2000 | 2.273     | 3.247    | 
0.70
ArraysSort.Int.testParallelSort | REPEATED |   90000 | 204.887   | 184.565  | 
1.11
ArraysSort.Int.testParallelSort | REPEATED |  400000 | 758.837   | 745.817  | 
1.02
ArraysSort.Int.testParallelSort | REPEATED | 3000000 | 7635.330  | 5881.615 | 
1.30
ArraysSort.Int.testParallelSort | STAGGER  |     600 | 1.045     | 0.807    | 
1.29
ArraysSort.Int.testParallelSort | STAGGER  |    2000 | 3.410     | 2.733    | 
1.25
ArraysSort.Int.testParallelSort | STAGGER  |   90000 | 88.093    | 73.869   | 
1.19
ArraysSort.Int.testParallelSort | STAGGER  |  400000 | 218.848   | 208.361  | 
1.05
ArraysSort.Int.testParallelSort | STAGGER  | 3000000 | 2245.299  | 2122.314 | 
1.06
ArraysSort.Int.testParallelSort | SHUFFLE  |     600 | 4.594     | 3.386    | 
1.36
ArraysSort.Int.testParallelSort | SHUFFLE  |    2000 | 17.759    | 11.570   | 
1.53
ArraysSort.Int.testParallelSort | SHUFFLE  |   90000 | 293.962   | 146.273  | 
2.01
ArraysSort.Int.testParallelSort | SHUFFLE  |  400000 | 1017.350  | 766.287  | 
1.33
ArraysSort.Int.testParallelSort | SHUFFLE  | 3000000 | 7419.434  | 5917.997 | 
1.25

Vamsi (@vamsi-parasa),
Could you please check your pom.xml and script to be sure that we use the same 
environment?
Are intrinsics not applied in package org.openjdk.bench.java.util?
Can you rerun benchmarking when classes a15, r20s and r20p are under package 
java.util?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13568#issuecomment-1913741195

Reply via email to