On Wed, 10 Jul 2024 14:24:05 GMT, Jasmine Karthikeyan <jkarthike...@openjdk.org> wrote:
> The C2 changes look nice! I just added one comment here about style. It would > also be good to add some IR tests checking that the intrinsic is creating > `MaxL`/`MinL` nodes before macro expansion, and a microbenchmark to compare > results. Thanks for the review. +1 to the IR tests, I'll work on those. Re: microbenchmark - what do you have exactly in mind? For vectorization performance there is `ReductionPerf` though it's not a microbenchmark per se. Do you want a microbenchmark for the performance of vectorized max/min long? For non-vectorization performance there is `MathBench`. I would not expect performance differences in `MathBench` because the backend is still the same and this change really benefits vectorization. I've run the min/max long tests on darwin/aarch64 and linux/x64 and indeed I see no difference: linux/x64 Benchmark (seed) Mode Cnt Score Error Units MathBench.maxLong 0 thrpt 8 1464197.164 ± 27044.205 ops/ms # base MathBench.minLong 0 thrpt 8 1469917.328 ± 25397.401 ops/ms # base MathBench.maxLong 0 thrpt 8 1469615.250 ± 17950.429 ops/ms # patched MathBench.minLong 0 thrpt 8 1456290.514 ± 44455.727 ops/ms # patched darwin/aarch64 Benchmark (seed) Mode Cnt Score Error Units MathBench.maxLong 0 thrpt 8 1739341.447 ? 210983.444 ops/ms # base MathBench.minLong 0 thrpt 8 1659547.649 ? 260554.159 ops/ms # base MathBench.maxLong 0 thrpt 8 1660449.074 ? 254534.725 ops/ms # patched MathBench.minLong 0 thrpt 8 1729728.021 ? 16327.575 ops/ms # patched ------------- PR Comment: https://git.openjdk.org/jdk/pull/20098#issuecomment-2232836799