On Wed, 25 Oct 2023 04:34:59 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:

> Hi All,
> 
> This patch optimizes sub-word gather operations for x86 targets with AVX2 and 
> AVX512 features.
> 
> The following is a summary of the changes:
> 
> 1) Intrinsify sub-word gather with a high-performance backend implementation 
> based on a hybrid algorithm. It first partially unrolls the scalar loop to 
> accumulate values from the gather indices into a quadword (64-bit) slice, then 
> uses a vector permutation to place the slice into the appropriate vector 
> lanes. This prevents code bloat and generates a compact JIT sequence. Coupled 
> with the savings from avoiding the expensive array allocation in the existing 
> Java implementation, this translates into significant performance gains of 
> 1.3-5x with the included micro benchmark.
> 
> 
> ![image](https://github.com/openjdk/jdk/assets/59989778/e25ba4ad-6a61-42fa-9566-452f741a9c6d)
> 
> 
> 2) The patch was also compared against a modified Java fallback 
> implementation that replaces the temporary array allocation with a 
> zero-initialized vector and a scalar loop which inserts the gathered values 
> into the vector. However, a vector insert into the higher vector lanes is a 
> three-step process: it first extracts the upper 128-bit vector lane, updates 
> it with the gathered sub-word value, and then inserts the lane back into its 
> original position. This makes inserts into higher-order lanes costly compared 
> to the proposed solution. In addition, the JIT code generated for the 
> modified fallback implementation was very bulky, which may impact inlining 
> decisions in caller contexts.
> 
> 3) Some minor adjustments to the existing gather instruction patterns for 
> double/quad words.
> 
> 
> Kindly review and share your feedback.
> 
> 
> Best Regards,
> Jatin
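
The hybrid scheme described in item 1 of the quoted summary can be modelled in plain scalar Java. This is only an illustrative sketch, not the HotSpot backend code: the class and method names are hypothetical, byte elements are assumed (so one 64-bit slice holds eight of them), and the inner copy loop merely stands in for the vector permutation that the real intrinsic would use to place each slice into its lanes.

```java
final class SubwordGatherSketch {
    // Hypothetical helper: packs eight gathered bytes into one 64-bit slice,
    // with gathered element k stored in bits [8k, 8k + 8).
    static long gatherSlice(byte[] src, int base, int[] indexMap, int mapOffset) {
        long slice = 0L;
        for (int k = 0; k < 8; k++) {
            // Mask to avoid sign extension before shifting into position.
            long b = src[base + indexMap[mapOffset + k]] & 0xFFL;
            slice |= b << (8 * k);
        }
        return slice;
    }

    // Gathers `lanes` bytes (assumed to be a multiple of 8) slice by slice.
    static byte[] gather(byte[] src, int base, int[] indexMap, int lanes) {
        byte[] out = new byte[lanes];
        for (int s = 0; s < lanes / 8; s++) {
            long slice = gatherSlice(src, base, indexMap, s * 8);
            // Stand-in for the vector permutation: place slice s into
            // result bytes [8s, 8s + 8).
            for (int k = 0; k < 8; k++) {
                out[s * 8 + k] = (byte) (slice >>> (8 * k));
            }
        }
        return out;
    }
}
```

The point of the sketch is that each group of eight loads produces a single 64-bit value, so the result needs one placement per slice rather than one vector insert per element.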

**Detailed performance numbers with AVX2**

Benchmark | Size | Baseline Score (ops/ms) | WithOpt Score (ops/ms) | Gain Factor (opt/baseline)
-- | -- | -- | -- | --
GatherOperationsBenchmark.microByteGather128 | 64 | 15916.774 | 34288.944 | 2.154264677
GatherOperationsBenchmark.microByteGather128 | 256 | 4128.501 | 8793.293 | 2.12989969
GatherOperationsBenchmark.microByteGather128 | 1024 | 1027.606 | 2217.138 | 2.157575958
GatherOperationsBenchmark.microByteGather128 | 4096 | 264.002 | 554.603 | 2.100753025
GatherOperationsBenchmark.microByteGather128_MASK | 64 | 16729.183 | 26308.667 | 1.57262115
GatherOperationsBenchmark.microByteGather128_MASK | 256 | 4157.73 | 7312.934 | 1.758876599
GatherOperationsBenchmark.microByteGather128_MASK | 1024 | 1067.675 | 1828.035 | 1.712164282
GatherOperationsBenchmark.microByteGather128_MASK | 4096 | 268.538 | 462.191 | 1.721138163
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF | 64 | 16559.725 | 25355.415 | 1.531149521
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF | 256 | 4190.36 | 6596.82 | 1.574284787
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF | 1024 | 1070.641 | 1638.323 | 1.530226285
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF | 4096 | 274.703 | 415.345 | 1.511978391
GatherOperationsBenchmark.microByteGather128_NZ_OFF | 64 | 15445.814 | 30518.41 | 1.975836948
GatherOperationsBenchmark.microByteGather128_NZ_OFF | 256 | 4087.154 | 8075.382 | 1.975795872
GatherOperationsBenchmark.microByteGather128_NZ_OFF | 1024 | 1035.527 | 2008.003 | 1.939112162
GatherOperationsBenchmark.microByteGather128_NZ_OFF | 4096 | 262.936 | 501.675 | 1.907973804
GatherOperationsBenchmark.microByteGather256 | 64 | 18266.25 | 37549.708 | 2.05568784
GatherOperationsBenchmark.microByteGather256 | 256 | 4714.027 | 9894.099 | 2.098863456
GatherOperationsBenchmark.microByteGather256 | 1024 | 1147.282 | 2490.351 | 2.1706529
GatherOperationsBenchmark.microByteGather256 | 4096 | 286.935 | 622.153 | 2.16827156
GatherOperationsBenchmark.microByteGather256_MASK | 64 | 21992.019 | 27357.032 | 1.243952727
GatherOperationsBenchmark.microByteGather256_MASK | 256 | 5732.258 | 7760.398 | 1.353811709
GatherOperationsBenchmark.microByteGather256_MASK | 1024 | 1495.632 | 1964.343 | 1.313386582
GatherOperationsBenchmark.microByteGather256_MASK | 4096 | 386.313 | 480.509 | 1.243833368
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF | 64 | 19911.793 | 26818.552 | 1.346867758
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF | 256 | 5013.248 | 7040.98 | 1.404474704
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF | 1024 | 1289.123 | 1785.368 | 1.384947751
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF | 4096 | 332.791 | 452.568 | 1.359916584
GatherOperationsBenchmark.microByteGather256_NZ_OFF | 64 | 17147.769 | 33913.351 | 1.977712144
GatherOperationsBenchmark.microByteGather256_NZ_OFF | 256 | 4386.044 | 8640.734 | 1.970051828
GatherOperationsBenchmark.microByteGather256_NZ_OFF | 1024 | 1097.485 | 2261.998 | 2.061074183
GatherOperationsBenchmark.microByteGather256_NZ_OFF | 4096 | 277.155 | 565.051 | 2.038754488
GatherOperationsBenchmark.microByteGather64 | 64 | 13068.085 | 37960.616 | 2.904833876
GatherOperationsBenchmark.microByteGather64 | 256 | 3227.857 | 9935.642 | 3.078092369
GatherOperationsBenchmark.microByteGather64 | 1024 | 834.99 | 2530.696 | 3.03080995
GatherOperationsBenchmark.microByteGather64 | 4096 | 212.664 | 637.938 | 2.999746078
GatherOperationsBenchmark.microByteGather64_MASK | 64 | 13548.225 | 30755.634 | 2.27008586
GatherOperationsBenchmark.microByteGather64_MASK | 256 | 3347.844 | 8026.22 | 2.39742951
GatherOperationsBenchmark.microByteGather64_MASK | 1024 | 843.279 | 2072.913 | 2.458157976
GatherOperationsBenchmark.microByteGather64_MASK | 4096 | 213.316 | 544.853 | 2.554205967
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF | 64 | 12982.383 | 28193.925 | 2.171706458
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF | 256 | 3288.497 | 7483.684 | 2.275715623
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF | 1024 | 834.342 | 1860.542 | 2.229951267
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF | 4096 | 208.107 | 473.987 | 2.277611998
GatherOperationsBenchmark.microByteGather64_NZ_OFF | 64 | 13079.567 | 32992.977 | 2.522482357
GatherOperationsBenchmark.microByteGather64_NZ_OFF | 256 | 3321.098 | 8987.837 | 2.706284789
GatherOperationsBenchmark.microByteGather64_NZ_OFF | 1024 | 865.324 | 2362.563 | 2.73026404
GatherOperationsBenchmark.microByteGather64_NZ_OFF | 4096 | 216.768 | 575.35 | 2.65422018
GatherOperationsBenchmark.microShortGather128 | 64 | 12835.472 | 31370.111 | 2.44401694
GatherOperationsBenchmark.microShortGather128 | 256 | 3151.091 | 8603.442 | 2.730305789
GatherOperationsBenchmark.microShortGather128 | 1024 | 820.026 | 2158.645 | 2.632410436
GatherOperationsBenchmark.microShortGather128 | 4096 | 205.263 | 535.444 | 2.60857534
GatherOperationsBenchmark.microShortGather128_MASK | 64 | 13055.905 | 23957.317 | 1.834979421
GatherOperationsBenchmark.microShortGather128_MASK | 256 | 3234.501 | 6416.879 | 1.983885304
GatherOperationsBenchmark.microShortGather128_MASK | 1024 | 829.648 | 1578.415 | 1.902511668
GatherOperationsBenchmark.microShortGather128_MASK | 4096 | 206.04 | 416.303 | 2.02049602
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF | 64 | 12905.373 | 22475.815 | 1.74158585
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF | 256 | 3202.372 | 5695.988 | 1.778677805
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF | 1024 | 814.645 | 1412.466 | 1.733842349
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF | 4096 | 199.535 | 355.407 | 1.781176235
GatherOperationsBenchmark.microShortGather128_NZ_OFF | 64 | 12329.793 | 27620.341 | 2.240130147
GatherOperationsBenchmark.microShortGather128_NZ_OFF | 256 | 3146.016 | 7664.47 | 2.436246351
GatherOperationsBenchmark.microShortGather128_NZ_OFF | 1024 | 794.335 | 1925.535 | 2.424084297
GatherOperationsBenchmark.microShortGather128_NZ_OFF | 4096 | 195.754 | 485.942 | 2.482411598
GatherOperationsBenchmark.microShortGather256 | 64 | 15430.153 | 33050.636 | 2.141951282
GatherOperationsBenchmark.microShortGather256 | 256 | 4042.835 | 8901.664 | 2.201837077
GatherOperationsBenchmark.microShortGather256 | 1024 | 986.361 | 2180.195 | 2.210341853
GatherOperationsBenchmark.microShortGather256 | 4096 | 250.057 | 560.523 | 2.24158092
GatherOperationsBenchmark.microShortGather256_MASK | 64 | 16793.012 | 23516.915 | 1.400398868
GatherOperationsBenchmark.microShortGather256_MASK | 256 | 4249.641 | 6505.857 | 1.5309192
GatherOperationsBenchmark.microShortGather256_MASK | 1024 | 1105.868 | 1600.44 | 1.447225166
GatherOperationsBenchmark.microShortGather256_MASK | 4096 | 268.443 | 410.052 | 1.527519809
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF | 64 | 16107.265 | 22559.877 | 1.400602585
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF | 256 | 4035.417 | 5872.376 | 1.455209214
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF | 1024 | 1028.671 | 1469.825 | 1.428858206
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF | 4096 | 258.639 | 370.997 | 1.434420176
GatherOperationsBenchmark.microShortGather256_NZ_OFF | 64 | 14761.16 | 27601.245 | 1.869856095
GatherOperationsBenchmark.microShortGather256_NZ_OFF | 256 | 3905.104 | 7684.751 | 1.967873583
GatherOperationsBenchmark.microShortGather256_NZ_OFF | 1024 | 986.575 | 1853.319 | 1.878538378
GatherOperationsBenchmark.microShortGather256_NZ_OFF | 4096 | 248.541 | 485.734 | 1.954341537
GatherOperationsBenchmark.microShortGather64 | 64 | 7942.618 | 33097.908 | 4.167128269
GatherOperationsBenchmark.microShortGather64 | 256 | 2009.148 | 9039.775 | 4.499307667
GatherOperationsBenchmark.microShortGather64 | 1024 | 506.769 | 2198.022 | 4.33732529
GatherOperationsBenchmark.microShortGather64 | 4096 | 118.499 | 565.551 | 4.772622554
GatherOperationsBenchmark.microShortGather64_MASK | 64 | 7802.345 | 23559.186 | 3.019500676
GatherOperationsBenchmark.microShortGather64_MASK | 256 | 1917.049 | 6278.454 | 3.275061827
GatherOperationsBenchmark.microShortGather64_MASK | 1024 | 491.248 | 1569.524 | 3.194972804
GatherOperationsBenchmark.microShortGather64_MASK | 4096 | 117.255 | 398.438 | 3.398046992
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF | 64 | 7697.165 | 22599.8 | 2.936119987
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF | 256 | 1913.269 | 5986.04 | 3.128697533
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF | 1024 | 483.724 | 1491.969 | 3.084339417
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF | 4096 | 116.716 | 375.492 | 3.217142465
GatherOperationsBenchmark.microShortGather64_NZ_OFF | 64 | 7882.26 | 29755.573 | 3.775005265
GatherOperationsBenchmark.microShortGather64_NZ_OFF | 256 | 1992.655 | 7969.383 | 3.99937922
GatherOperationsBenchmark.microShortGather64_NZ_OFF | 1024 | 498.249 | 1997.082 | 4.008200719
GatherOperationsBenchmark.microShortGather64_NZ_OFF | 4096 | 117.764 | 497.177 | 4.221808023

**Detailed performance numbers with AVX512 (AVX3)**


Benchmark | Size | Baseline Score (ops/ms) | WithOpt Score (ops/ms) | Gain Factor (opt/baseline)
-- | -- | -- | -- | --
GatherOperationsBenchmark.microByteGather128 | 64 | 15900.681 | 35745.941 | 2.248076104
GatherOperationsBenchmark.microByteGather128 | 256 | 4194.349 | 9931.187 | 2.36775409
GatherOperationsBenchmark.microByteGather128 | 1024 | 1064.611 | 2528.468 | 2.375015851
GatherOperationsBenchmark.microByteGather128 | 4096 | 270.486 | 633.351 | 2.341529691
GatherOperationsBenchmark.microByteGather128_MASK | 64 | 17836.944 | 30418.654 | 1.705373634
GatherOperationsBenchmark.microByteGather128_MASK | 256 | 4411.449 | 8451.317 | 1.915768946
GatherOperationsBenchmark.microByteGather128_MASK | 1024 | 1155.587 | 2119.895 | 1.8344746
GatherOperationsBenchmark.microByteGather128_MASK | 4096 | 287.88 | 538.807 | 1.871637488
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF | 64 | 16678.512 | 27223.074 | 1.632224385
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF | 256 | 4268.674 | 7395.33 | 1.732465398
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF | 1024 | 1119.764 | 1854.529 | 1.656178445
GatherOperationsBenchmark.microByteGather128_MASK_NZ_OFF | 4096 | 276.836 | 469.102 | 1.694512274
GatherOperationsBenchmark.microByteGather128_NZ_OFF | 64 | 15561.662 | 33674.023 | 2.163909163
GatherOperationsBenchmark.microByteGather128_NZ_OFF | 256 | 4065.922 | 9427.52 | 2.318667205
GatherOperationsBenchmark.microByteGather128_NZ_OFF | 1024 | 1030.027 | 2430.395 | 2.359544944
GatherOperationsBenchmark.microByteGather128_NZ_OFF | 4096 | 261.51 | 609.811 | 2.331884058
GatherOperationsBenchmark.microByteGather256 | 64 | 17993.999 | 36026.071 | 2.002115872
GatherOperationsBenchmark.microByteGather256 | 256 | 4646.105 | 9695.417 | 2.086783876
GatherOperationsBenchmark.microByteGather256 | 1024 | 1131.979 | 2487.113 | 2.197137049
GatherOperationsBenchmark.microByteGather256 | 4096 | 278.159 | 624.745 | 2.24599959
GatherOperationsBenchmark.microByteGather256_MASK | 64 | 22898.291 | 30126.448 | 1.315663601
GatherOperationsBenchmark.microByteGather256_MASK | 256 | 5473.285 | 8843.556 | 1.615767496
GatherOperationsBenchmark.microByteGather256_MASK | 1024 | 1415.369 | 2230.048 | 1.575594774
GatherOperationsBenchmark.microByteGather256_MASK | 4096 | 358.725 | 556.882 | 1.552392501
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF | 64 | 20186.469 | 27915.464 | 1.382879988
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF | 256 | 5214.919 | 7578.939 | 1.453318642
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF | 1024 | 1360.825 | 1902.398 | 1.397974023
GatherOperationsBenchmark.microByteGather256_MASK_NZ_OFF | 4096 | 359.569 | 487.59 | 1.356040148
GatherOperationsBenchmark.microByteGather256_NZ_OFF | 64 | 17154.31 | 35904.295 | 2.093018897
GatherOperationsBenchmark.microByteGather256_NZ_OFF | 256 | 4404.264 | 8997.564 | 2.042921133
GatherOperationsBenchmark.microByteGather256_NZ_OFF | 1024 | 1098.961 | 2317.713 | 2.109003868
GatherOperationsBenchmark.microByteGather256_NZ_OFF | 4096 | 275.722 | 576.866 | 2.092201565
GatherOperationsBenchmark.microByteGather512 | 64 | 18790.829 | 38455.649 | 2.046511572
GatherOperationsBenchmark.microByteGather512 | 256 | 4806.001 | 10023.706 | 2.085664568
GatherOperationsBenchmark.microByteGather512 | 1024 | 1164.771 | 2558.357 | 2.19644634
GatherOperationsBenchmark.microByteGather512 | 4096 | 286.714 | 640.06 | 2.232398836
GatherOperationsBenchmark.microByteGather512_MASK | 64 | 25265.683 | 32738.543 | 1.295771145
GatherOperationsBenchmark.microByteGather512_MASK | 256 | 6417.048 | 8900.835 | 1.387060686
GatherOperationsBenchmark.microByteGather512_MASK | 1024 | 1726.425 | 2231.39 | 1.29249171
GatherOperationsBenchmark.microByteGather512_MASK | 4096 | 438.445 | 562.29 | 1.282464163
GatherOperationsBenchmark.microByteGather512_MASK_NZ_OFF | 64 | 22097.788 | 29326.431 | 1.32712066
GatherOperationsBenchmark.microByteGather512_MASK_NZ_OFF | 256 | 5587.934 | 7937.573 | 1.420484387
GatherOperationsBenchmark.microByteGather512_MASK_NZ_OFF | 1024 | 1485.739 | 1967.966 | 1.324570466
GatherOperationsBenchmark.microByteGather512_MASK_NZ_OFF | 4096 | 350.395 | 502.295 | 1.433510752
GatherOperationsBenchmark.microByteGather512_NZ_OFF | 64 | 17904.091 | 34883.849 | 1.948373084
GatherOperationsBenchmark.microByteGather512_NZ_OFF | 256 | 4532.971 | 9354.373 | 2.063629571
GatherOperationsBenchmark.microByteGather512_NZ_OFF | 1024 | 1135.769 | 2394.267 | 2.108058065
GatherOperationsBenchmark.microByteGather512_NZ_OFF | 4096 | 285.823 | 588.764 | 2.059890212
GatherOperationsBenchmark.microByteGather64 | 64 | 13044.341 | 32947.355 | 2.525796819
GatherOperationsBenchmark.microByteGather64 | 256 | 3244.318 | 8817.036 | 2.717685504
GatherOperationsBenchmark.microByteGather64 | 1024 | 812.016 | 2205.047 | 2.715521615
GatherOperationsBenchmark.microByteGather64 | 4096 | 212.882 | 559.439 | 2.627930027
GatherOperationsBenchmark.microByteGather64_MASK | 64 | 13328.592 | 25055.284 | 1.879814762
GatherOperationsBenchmark.microByteGather64_MASK | 256 | 3294.445 | 6779.14 | 2.057748726
GatherOperationsBenchmark.microByteGather64_MASK | 1024 | 832.091 | 1693.255 | 2.034939688
GatherOperationsBenchmark.microByteGather64_MASK | 4096 | 213.049 | 432.162 | 2.028462936
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF | 64 | 12713.428 | 23246.19 | 1.828475373
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF | 256 | 3238.82 | 6168.581 | 1.904576667
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF | 1024 | 819.699 | 1548.811 | 1.889487483
GatherOperationsBenchmark.microByteGather64_MASK_NZ_OFF | 4096 | 207.266 | 388.488 | 1.874345045
GatherOperationsBenchmark.microByteGather64_NZ_OFF | 64 | 12740.659 | 28229.421 | 2.215695515
GatherOperationsBenchmark.microByteGather64_NZ_OFF | 256 | 3255.884 | 7816.444 | 2.400713293
GatherOperationsBenchmark.microByteGather64_NZ_OFF | 1024 | 831.691 | 1976.915 | 2.376982557
GatherOperationsBenchmark.microByteGather64_NZ_OFF | 4096 | 210.812 | 503.111 | 2.386538717
GatherOperationsBenchmark.microShortGather128 | 64 | 12858.925 | 34710.696 | 2.699346641
GatherOperationsBenchmark.microShortGather128 | 256 | 3106.472 | 9171.459 | 2.952371372
GatherOperationsBenchmark.microShortGather128 | 1024 | 819.192 | 2278.838 | 2.781811834
GatherOperationsBenchmark.microShortGather128 | 4096 | 204.157 | 575.636 | 2.819575131
GatherOperationsBenchmark.microShortGather128_MASK | 64 | 12528.506 | 28202.741 | 2.251085724
GatherOperationsBenchmark.microShortGather128_MASK | 256 | 3236.653 | 7798 | 2.409278968
GatherOperationsBenchmark.microShortGather128_MASK | 1024 | 820.409 | 1991.597 | 2.427566007
GatherOperationsBenchmark.microShortGather128_MASK | 4096 | 203.635 | 509.145 | 2.500282368
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF | 64 | 12166.914 | 25418.26 | 2.089129585
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF | 256 | 3202.89 | 6914.467 | 2.158821252
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF | 1024 | 799.485 | 1752.541 | 2.192087406
GatherOperationsBenchmark.microShortGather128_MASK_NZ_OFF | 4096 | 199.868 | 442.822 | 2.215572278
GatherOperationsBenchmark.microShortGather128_NZ_OFF | 64 | 12531.197 | 31505.217 | 2.514142663
GatherOperationsBenchmark.microShortGather128_NZ_OFF | 256 | 3150.884 | 9098.353 | 2.887555683
GatherOperationsBenchmark.microShortGather128_NZ_OFF | 1024 | 806.932 | 2280.42 | 2.826037386
GatherOperationsBenchmark.microShortGather128_NZ_OFF | 4096 | 197.351 | 571.426 | 2.895480641
GatherOperationsBenchmark.microShortGather256 | 64 | 15255.994 | 34013.422 | 2.22951202
GatherOperationsBenchmark.microShortGather256 | 256 | 3986.306 | 9196.43 | 2.307005533
GatherOperationsBenchmark.microShortGather256 | 1024 | 1003.058 | 2294.437 | 2.287442002
GatherOperationsBenchmark.microShortGather256 | 4096 | 257.45 | 560.259 | 2.176185667
GatherOperationsBenchmark.microShortGather256_MASK | 64 | 17194.868 | 26817.506 | 1.559622673
GatherOperationsBenchmark.microShortGather256_MASK | 256 | 4307.911 | 7252.799 | 1.683600009
GatherOperationsBenchmark.microShortGather256_MASK | 1024 | 1109.034 | 1803.594 | 1.626274758
GatherOperationsBenchmark.microShortGather256_MASK | 4096 | 274.104 | 458.023 | 1.670982547
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF | 64 | 15615.164 | 25553.091 | 1.636427962
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF | 256 | 4041.88 | 6826.642 | 1.688976912
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF | 1024 | 1037.741 | 1706.208 | 1.644155912
GatherOperationsBenchmark.microShortGather256_MASK_NZ_OFF | 4096 | 257.732 | 439.843 | 1.706590567
GatherOperationsBenchmark.microShortGather256_NZ_OFF | 64 | 14795.183 | 31449.844 | 2.125681311
GatherOperationsBenchmark.microShortGather256_NZ_OFF | 256 | 3931.158 | 8463.181 | 2.15284682
GatherOperationsBenchmark.microShortGather256_NZ_OFF | 1024 | 989.033 | 2120.528 | 2.144041705
GatherOperationsBenchmark.microShortGather256_NZ_OFF | 4096 | 247.472 | 537.454 | 2.171777009
GatherOperationsBenchmark.microShortGather512 | 64 | 17122.803 | 33007.209 | 1.927675568
GatherOperationsBenchmark.microShortGather512 | 256 | 4378.354 | 9043.805 | 2.065571902
GatherOperationsBenchmark.microShortGather512 | 1024 | 1058.899 | 2292.852 | 2.165316994
GatherOperationsBenchmark.microShortGather512 | 4096 | 255.502 | 577.995 | 2.262193642
GatherOperationsBenchmark.microShortGather512_MASK | 64 | 21232.499 | 27840.109 | 1.311202652
GatherOperationsBenchmark.microShortGather512_MASK | 256 | 5520.535 | 8068.078 | 1.461466688
GatherOperationsBenchmark.microShortGather512_MASK | 1024 | 1424.798 | 2058.709 | 1.444912893
GatherOperationsBenchmark.microShortGather512_MASK | 4096 | 355.353 | 472.564 | 1.329843845
GatherOperationsBenchmark.microShortGather512_MASK_NZ_OFF | 64 | 19155.458 | 25609.616 | 1.336935718
GatherOperationsBenchmark.microShortGather512_MASK_NZ_OFF | 256 | 4927.708 | 7004.695 | 1.421491493
GatherOperationsBenchmark.microShortGather512_MASK_NZ_OFF | 1024 | 1275.924 | 1773.667 | 1.390103956
GatherOperationsBenchmark.microShortGather512_MASK_NZ_OFF | 4096 | 312.012 | 444.827 | 1.425672731
GatherOperationsBenchmark.microShortGather512_NZ_OFF | 64 | 16195.665 | 33555.966 | 2.071910354
GatherOperationsBenchmark.microShortGather512_NZ_OFF | 256 | 4123.235 | 7852.8 | 1.904523996
GatherOperationsBenchmark.microShortGather512_NZ_OFF | 1024 | 1033.009 | 1722.473 | 1.667432714
GatherOperationsBenchmark.microShortGather512_NZ_OFF | 4096 | 256.001 | 441.754 | 1.725594822
GatherOperationsBenchmark.microShortGather64 | 64 | 7717.303 | 34015.02 | 4.40763049
GatherOperationsBenchmark.microShortGather64 | 256 | 1940.575 | 9191.168 | 4.73631166
GatherOperationsBenchmark.microShortGather64 | 1024 | 490.29 | 2294.718 | 4.680327969
GatherOperationsBenchmark.microShortGather64 | 4096 | 117.59 | 579.407 | 4.927349264
GatherOperationsBenchmark.microShortGather64_MASK | 64 | 7325.501 | 26815.558 | 3.660576662
GatherOperationsBenchmark.microShortGather64_MASK | 256 | 1792.717 | 7265.925 | 4.053023985
GatherOperationsBenchmark.microShortGather64_MASK | 1024 | 456.88 | 1805.418 | 3.951624059
GatherOperationsBenchmark.microShortGather64_MASK | 4096 | 109.657 | 454.874 | 4.148152877
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF | 64 | 7157.769 | 25542.547 | 3.568506751
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF | 256 | 1763.398 | 6817.439 | 3.866080715
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF | 1024 | 453.254 | 1704.752 | 3.761140553
GatherOperationsBenchmark.microShortGather64_MASK_NZ_OFF | 4096 | 108.622 | 439.762 | 4.0485537
GatherOperationsBenchmark.microShortGather64_NZ_OFF | 64 | 7708.116 | 31334.47 | 4.065126939
GatherOperationsBenchmark.microShortGather64_NZ_OFF | 256 | 1954.153 | 8460.249 | 4.329368785
GatherOperationsBenchmark.microShortGather64_NZ_OFF | 1024 | 493.538 | 2120.589 | 4.296708663
GatherOperationsBenchmark.microShortGather64_NZ_OFF | 4096 | 116.976 | 537.472 | 4.594720285
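
For readers who want to re-derive the last column, the gain factor is simply the optimized score divided by the baseline score, as the column header states. The snippet below is a small sketch of that arithmetic; the class and method names are hypothetical, and the sample values are copied from the first AVX2 row (`microByteGather128`, size 64).

```java
final class GainFactor {
    // Gain factor as defined by the table header: opt / baseline.
    // Both scores are throughputs in ops/ms, so higher is better and a
    // value above 1.0 means the optimized build is faster.
    static double gain(double baselineOpsPerMs, double withOptOpsPerMs) {
        return withOptOpsPerMs / baselineOpsPerMs;
    }

    public static void main(String[] args) {
        // Scores taken from the microByteGather128 / size 64 row (AVX2 table).
        double g = gain(15916.774, 34288.944);
        System.out.printf("gain factor = %.6f%n", g);
    }
}
```

The result should agree with the tabulated 2.154264677 up to rounding of the printed digits.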

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16354#issuecomment-1778499766
PR Comment: https://git.openjdk.org/jdk/pull/16354#issuecomment-1778500434