> Although the scalar AVX10 floating point min/max instructions (VMINMAXSD, 
> VMINMAXSS, VMINMAXSH) are compact, it's better not to use them in reduction 
> loops. In a reduction, each iteration's result feeds the next one, so these 
> instructions form a serial data dependency chain across loop iterations. An 
> alternate implementation using comparisons and jumps leverages branch 
> prediction and limits the effects of data dependencies to cheaper 
> instructions (e.g., MOV).
> 
> For that reason, these changes remove AVX10 floating point min/max 
> instructions from single and double precision floating point reduction 
> loops and use a separate sequence of instructions instead. 
> Currently, min/max half precision floating point reduction loops aren't 
> detectable, so they will be handled in a separate PR. There is also some code 
> cleanup to remove unused instruction definitions while also adding necessary 
> supporting infrastructure. The JTREG tests listed below were used to verify 
> correctness with the recommended JVM options mentioned in corresponding 
> source files. All modifications and tests used [OpenJDK 
> v27-b12](https://github.com/openjdk/jdk/releases/tag/jdk-27%2B12) as the 
> baseline build.
> 
> 1. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector64Tests.java`
> 2. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector128Tests.java`
> 3. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector256Tests.java`
> 4. `jtreg:test/jdk/jdk/incubator/vector/DoubleVector512Tests.java`
> 5. `jtreg:test/jdk/jdk/incubator/vector/DoubleVectorMaxTests.java`
> 6. `jtreg:test/jdk/jdk/incubator/vector/FloatVector64Tests.java`
> 7. `jtreg:test/jdk/jdk/incubator/vector/FloatVector128Tests.java`
> 8. `jtreg:test/jdk/jdk/incubator/vector/FloatVector256Tests.java`
> 9. `jtreg:test/jdk/jdk/incubator/vector/FloatVector512Tests.java`
> 10. `jtreg:test/jdk/jdk/incubator/vector/FloatVectorMaxTests.java`
> 11. `jtreg:test/hotspot/jtreg/compiler/vectorization/TestFloat16VectorOperations.java`
> 12. `jtreg:test/hotspot/jtreg/compiler/c2/irTests/TestFloat16ScalarOperations.java`
> 13. `jtreg:test/hotspot/jtreg/compiler/igvn/TestMinMaxIdentity.java`
> 14. `jtreg:test/hotspot/jtreg/compiler/intrinsics/float16/TestFloat16MaxMinSpecialValues.java`
> 
> Finally, the JMH micro-benchmarks listed below were updated to ensure all 
> code paths are exercised.
> 
> 1. `micro:test/micro/org/openjdk/bench/jdk/incubator/vector/Float16OperationsBenchmark.java`
> 2. `micro:test/micro/org/openjdk/bench/vm/compiler/FpMinMaxIntrinsics.java`
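
For readers unfamiliar with the pattern, the reduction loops discussed in the quoted description can be illustrated at the Java source level. The sketch below is only an analogy: the class and method names are hypothetical, and it does not show the instruction sequence C2 actually emits in this PR. The first method carries the accumulator through `Math.min` on every iteration; the second gets the same results with comparisons and a conditional assignment, assuming `Math.min` semantics for NaN and signed zeros must be preserved.

```java
public class MinReductionSketch {

    // Reduction through Math.min: each call consumes the previous result,
    // so the min operations form a chain across loop iterations.
    static double minWithMathMin(double[] values) {
        double acc = Double.POSITIVE_INFINITY;
        for (double v : values) {
            acc = Math.min(acc, v);
        }
        return acc;
    }

    // Compare-and-branch form intended to match Math.min results,
    // including NaN propagation and -0.0 ordering below +0.0.
    static double minWithBranches(double[] values) {
        double acc = Double.POSITIVE_INFINITY;
        for (double v : values) {
            if (v != v) {
                return Double.NaN; // any NaN input makes the whole reduction NaN
            }
            // -0.0 == 0.0 under IEEE comparison, so check the sign bit explicitly.
            boolean negZeroBeatsZero =
                v == 0.0 && acc == 0.0 && Double.doubleToRawLongBits(v) < 0L;
            if (v < acc || negZeroBeatsZero) {
                acc = v; // the update itself is a plain assignment behind a branch
            }
        }
        return acc;
    }

    public static void main(String[] args) {
        double[] data = {3.5, -0.0, 0.0, 7.25, -2.0};
        System.out.println(minWithMathMin(data));
        System.out.println(minWithBranches(data));
    }
}
```

Whether the branching form is faster depends on how predictable the comparison is for the input data, which is what a benchmark such as the ones listed above would need to measure.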

Mohamed Issa has updated the pull request with a new target base due to a merge 
or a rebase. The pull request now contains five commits:

 - Merge branch 'master' into user/missa-prime/avx10_2
 - Remove half precision min/max reduction definitions and adjust corresponding benchmarks.
 - Use alternative instruction flow for half precision reduction loops and add supporting infrastructure.
 - Merge branch 'master' into user/missa-prime/avx10_2
 - Replace scalar AVX10.2 floating point min/max instructions with more efficient sequence

-------------

Changes: https://git.openjdk.org/jdk/pull/29831/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=29831&range=01
  Stats: 601 lines in 9 files changed: 425 ins; 86 del; 90 mod
  Patch: https://git.openjdk.org/jdk/pull/29831.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/29831/head:pull/29831

PR: https://git.openjdk.org/jdk/pull/29831
