> This patch intrinsifies `Math.max(long, long)` and `Math.min(long, long)` in
> order to help improve vectorization performance.
>
> Currently vectorization does not kick in for loops containing either of these
> calls because of the following error:
>
>     VLoop::check_preconditions: failed: control flow in loop not allowed
>
> The control flow is due to the Java implementation of these methods, e.g.
>
>     public static long max(long a, long b) {
>         return (a >= b) ? a : b;
>     }
>
> This patch intrinsifies the calls to replace the CmpL + Bool nodes with
> MaxL/MinL nodes respectively.
> By doing this, vectorization no longer finds the control flow and so it can
> carry out the vectorization.
> E.g.
>
>     SuperWord::transform_loop:
>         Loop: N518/N126 counted [int,int),+4 (1025 iters) main has_sfpt strip_mined
>     518 CountedLoop === 518 246 126 [[ 513 517 518 242 521 522 422 210 ]] inner stride: 4 main of N518 strip mined !orig=[419],[247],[216],[193] !jvms: Test::test @ bci:14 (line 21)
>
> Applying the same changes to `ReductionPerf` as in
> https://github.com/openjdk/jdk/pull/13056, we can compare the results before
> and after. Before the patch, on darwin/aarch64 (M1):
>
>     ==============================
>     Test summary
>     ==============================
>        TEST                                              TOTAL  PASS  FAIL ERROR
>        jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java
>                                                              1     1     0     0
>     ==============================
>     TEST SUCCESS
>
>     long min 1155
>     long max 1173
>
> After the patch, on darwin/aarch64 (M1):
>
>     ==============================
>     Test summary
>     ==============================
>        TEST                                              TOTAL  PASS  FAIL ERROR
>        jtreg:test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java
>                                                              1     1     0     0
>     ==============================
>     TEST SUCCESS
>
>     long min 1042
>     long max 1042
>
> This patch does not add any platform-specific backend implementations for the
> MaxL/MinL nodes.
> Therefore, it still relies on the macro expansion to transform those into
> CMoveL.
>
> I've run tier1 and hotspot compiler tests on darwin/aarch64 and got these
> results:
>
>     ==============================
>     Test summary
>     ==============================
>        TEST                                              TOTAL  PASS  FAIL ERROR
>        jtreg:test/hotspot/jtreg:tier1                     2500  2500     0     0
>     >> jtreg:test/jdk:tier1 ...
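
For illustration, the loops the quoted description talks about have roughly the following shape. This is only a minimal sketch: the class and method names are hypothetical, it is not the `ReductionPerf` test nor the benchmark added in this PR, and whether C2 actually vectorizes a given loop still depends on the platform, the JVM flags and the exact loop shape.

```java
public class LongMinMaxLoops {

    // Reduction: each iteration calls Math.max(long, long). Per the
    // description above, the ternary inside Math.max shows up as control
    // flow in the loop unless the call is intrinsified.
    static long maxReduce(long[] values) {
        long result = Long.MIN_VALUE;
        for (int i = 0; i < values.length; i++) {
            result = Math.max(result, values[i]);
        }
        return result;
    }

    // Same shape for Math.min(long, long).
    static long minReduce(long[] values) {
        long result = Long.MAX_VALUE;
        for (int i = 0; i < values.length; i++) {
            result = Math.min(result, values[i]);
        }
        return result;
    }

    // Element-wise variant: no loop-carried dependency, one max per lane.
    // Assumes a, b and out all have the same length.
    static void maxElementwise(long[] a, long[] b, long[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = Math.max(a[i], b[i]);
        }
    }

    public static void main(String[] args) {
        long[] data = {3, -7, 42, 0};
        System.out.println("long max " + maxReduce(data)); // prints "long max 42"
        System.out.println("long min " + minReduce(data)); // prints "long min -7"
    }
}
```

The methods return the same values with or without the intrinsic; only the generated code is expected to differ.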

Galder Zamarreño has updated the pull request incrementally with 17 additional commits since the last revision:

 - Remove previous benchmark effort
 - Multiply array value in reduction for vectorization to kick in
 - Renamed benchmark methods
 - Add min/max benchmark that includes loops and reductions
 - Skip single array benchmarks
 - Add an intermediate % that is more representative of real life
 - Fix compilation error
 - Fix min case to distribute numbers as per probability
 - Distribute values targetting a branch percentage

   * Use a random increment algorithm,
   to create an array of values such that min/max branch percentage matches.
 - Fix format of assembly for the movl to movq switch
 - ... and 7 more: https://git.openjdk.org/jdk/compare/3dd72b89...28778c84

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/20098/files
  - new: https://git.openjdk.org/jdk/pull/20098/files/3dd72b89..28778c84

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=20098&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=20098&range=00-01

  Stats: 562 lines in 5 files changed: 418 ins; 132 del; 12 mod
  Patch: https://git.openjdk.org/jdk/pull/20098.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/20098/head:pull/20098

PR: https://git.openjdk.org/jdk/pull/20098