On Wed, 19 Feb 2025 17:43:54 GMT, Galder Zamarreño <gal...@openjdk.org> wrote:
>> Galder Zamarreño has updated the pull request with a new target base due to
>> a merge or a rebase. The incremental webrev excludes the unrelated changes
>> brought in by the merge/rebase. The pull request contains 44 additional
>> commits since the last revision:
>>
>>  - Merge branch 'master' into topic.intrinsify-max-min-long
>>  - Fix typo
>>  - Renaming methods and variables and add docu on algorithms
>>  - Fix copyright years
>>  - Make sure it runs with cpus with either avx512 or asimd
>>  - Test can only run with 256 bit registers or bigger
>>
>>    * Remove platform dependant check
>>    and use platform independent configuration instead.
>>  - Fix license header
>>  - Tests should also run on aarch64 asimd=true envs
>>  - Added comment around the assertions
>>  - Adjust min/max identity IR test expectations after changes
>>  - ... and 34 more: https://git.openjdk.org/jdk/compare/384bab03...a190ae68
>
> I will run a comparison next with the same batch of tests but looking at
> `int` and see if there are any differences compared with `long` or not.

Hi @galderz,

Results from Graviton 3 (Neoverse-V1).

Without the patch:

Benchmark                       (probability)  (range)  (seed)  (size)   Mode  Cnt      Score     Error  Units
MinMaxVector.intClippingRange             N/A       90       0    1000  thrpt    8  12565.427  ± 37.538  ops/ms
MinMaxVector.intClippingRange             N/A      100       0    1000  thrpt    8  12462.072  ± 84.067  ops/ms
MinMaxVector.intLoopMax                    50      N/A     N/A    2048  thrpt    8   5113.090  ± 68.720  ops/ms
MinMaxVector.intLoopMax                    80      N/A     N/A    2048  thrpt    8   5129.857  ± 35.005  ops/ms
MinMaxVector.intLoopMax                   100      N/A     N/A    2048  thrpt    8   5116.081  ±  8.946  ops/ms
MinMaxVector.intLoopMin                    50      N/A     N/A    2048  thrpt    8   6174.544  ± 52.573  ops/ms
MinMaxVector.intLoopMin                    80      N/A     N/A    2048  thrpt    8   6110.884  ± 54.447  ops/ms
MinMaxVector.intLoopMin                   100      N/A     N/A    2048  thrpt    8   6178.661  ± 48.450  ops/ms
MinMaxVector.intReductionMax               50      N/A     N/A    2048  thrpt    8   5109.270  ± 10.525  ops/ms
MinMaxVector.intReductionMax               80      N/A     N/A    2048  thrpt    8   5123.426  ± 28.229  ops/ms
MinMaxVector.intReductionMax              100      N/A     N/A    2048  thrpt    8   5133.799  ±  7.693  ops/ms
MinMaxVector.intReductionMin               50      N/A     N/A    2048  thrpt    8   5130.209  ± 15.491  ops/ms
MinMaxVector.intReductionMin               80      N/A     N/A    2048  thrpt    8   5127.823  ± 27.767  ops/ms
MinMaxVector.intReductionMin              100      N/A     N/A    2048  thrpt    8   5118.217  ± 22.186  ops/ms
MinMaxVector.longClippingRange            N/A       90       0    1000  thrpt    8   1831.026  ± 15.502  ops/ms
MinMaxVector.longClippingRange            N/A      100       0    1000  thrpt    8   1827.194  ± 22.076  ops/ms
MinMaxVector.longLoopMax                   50      N/A     N/A    2048  thrpt    8   2643.383  ±  9.830  ops/ms
MinMaxVector.longLoopMax                   80      N/A     N/A    2048  thrpt    8   2640.417  ±  7.797  ops/ms
MinMaxVector.longLoopMax                  100      N/A     N/A    2048  thrpt    8   1244.321  ±  1.001  ops/ms
MinMaxVector.longLoopMin                   50      N/A     N/A    2048  thrpt    8   3239.234  ±  8.813  ops/ms
MinMaxVector.longLoopMin                   80      N/A     N/A    2048  thrpt    8   3252.713  ±  3.446  ops/ms
MinMaxVector.longLoopMin                  100      N/A     N/A    2048  thrpt    8   1204.370  ± 10.537  ops/ms
MinMaxVector.longReductionMax              50      N/A     N/A    2048  thrpt    8   2536.322  ±  0.127  ops/ms
MinMaxVector.longReductionMax              80      N/A     N/A    2048  thrpt    8   2536.318  ±  0.277  ops/ms
MinMaxVector.longReductionMax             100      N/A     N/A    2048  thrpt    8   1395.273  ± 13.862  ops/ms
MinMaxVector.longReductionMin              50      N/A     N/A    2048  thrpt    8   2536.325  ±  0.146  ops/ms
MinMaxVector.longReductionMin              80      N/A     N/A    2048  thrpt    8   2536.265  ±  0.272  ops/ms
MinMaxVector.longReductionMin             100      N/A     N/A    2048  thrpt    8   1389.982  ±  5.345  ops/ms

With the patch:

Benchmark                       (probability)  (range)  (seed)  (size)   Mode  Cnt      Score     Error  Units
MinMaxVector.intClippingRange             N/A       90       0    1000  thrpt    8  12598.201  ± 52.631  ops/ms
MinMaxVector.intClippingRange             N/A      100       0    1000  thrpt    8  12555.284  ± 62.472  ops/ms
MinMaxVector.intLoopMax                    50      N/A     N/A    2048  thrpt    8   5079.499  ± 16.392  ops/ms
MinMaxVector.intLoopMax                    80      N/A     N/A    2048  thrpt    8   5100.673  ± 30.376  ops/ms
MinMaxVector.intLoopMax                   100      N/A     N/A    2048  thrpt    8   5082.544  ± 23.540  ops/ms
MinMaxVector.intLoopMin                    50      N/A     N/A    2048  thrpt    8   6137.512  ± 30.198  ops/ms
MinMaxVector.intLoopMin                    80      N/A     N/A    2048  thrpt    8   6136.233  ±  7.726  ops/ms
MinMaxVector.intLoopMin                   100      N/A     N/A    2048  thrpt    8   6142.262  ± 96.510  ops/ms
MinMaxVector.intReductionMax               50      N/A     N/A    2048  thrpt    8   5116.055  ± 23.270  ops/ms
MinMaxVector.intReductionMax               80      N/A     N/A    2048  thrpt    8   5111.481  ± 12.236  ops/ms
MinMaxVector.intReductionMax              100      N/A     N/A    2048  thrpt    8   5106.367  ±  9.035  ops/ms
MinMaxVector.intReductionMin               50      N/A     N/A    2048  thrpt    8   5115.666  ± 15.539  ops/ms
MinMaxVector.intReductionMin               80      N/A     N/A    2048  thrpt    8   5133.127  ±  4.918  ops/ms
MinMaxVector.intReductionMin              100      N/A     N/A    2048  thrpt    8   5120.469  ± 24.355  ops/ms
MinMaxVector.longClippingRange            N/A       90       0    1000  thrpt    8   5094.259  ± 14.092  ops/ms
MinMaxVector.longClippingRange            N/A      100       0    1000  thrpt    8   5096.835  ± 16.517  ops/ms
MinMaxVector.longLoopMax                   50      N/A     N/A    2048  thrpt    8   2636.438  ± 18.760  ops/ms
MinMaxVector.longLoopMax                   80      N/A     N/A    2048  thrpt    8   2644.069  ±  3.933  ops/ms
MinMaxVector.longLoopMax                  100      N/A     N/A    2048  thrpt    8   2646.250  ±  2.007  ops/ms
MinMaxVector.longLoopMin                   50      N/A     N/A    2048  thrpt    8   2648.504  ± 18.294  ops/ms
MinMaxVector.longLoopMin                   80      N/A     N/A    2048  thrpt    8   2658.082  ±  3.362  ops/ms
MinMaxVector.longLoopMin                  100      N/A     N/A    2048  thrpt    8   2647.532  ±  5.600  ops/ms
MinMaxVector.longReductionMax              50      N/A     N/A    2048  thrpt    8   2536.254  ±  0.086  ops/ms
MinMaxVector.longReductionMax              80      N/A     N/A    2048  thrpt    8   2536.209  ±  0.129  ops/ms
MinMaxVector.longReductionMax             100      N/A     N/A    2048  thrpt    8   2536.342  ±  0.068  ops/ms
MinMaxVector.longReductionMin              50      N/A     N/A    2048  thrpt    8   2536.271  ±  0.203  ops/ms
MinMaxVector.longReductionMin              80      N/A     N/A    2048  thrpt    8   2536.250  ±  0.343  ops/ms
MinMaxVector.longReductionMin             100      N/A     N/A    2048  thrpt    8   2536.246  ±  0.179  ops/ms

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2669613497