> Hi, > > This is just a redo of https://github.com/openjdk/jdk/pull/13093. mostly just > the revert of the backout. > > Regarding the related issues: > > - [JDK-8306008](https://bugs.openjdk.org/browse/JDK-8306008) and > [JDK-8309531](https://bugs.openjdk.org/browse/JDK-8309531) have been fixed > before the backout. > - [JDK-8309373](https://bugs.openjdk.org/browse/JDK-8309373) was due to > missing `ForceInline` on `AbstractVector::toBitsVectorTemplate` > - [JDK-8306592](https://bugs.openjdk.org/browse/JDK-8306592), I have not been > able to find the root causes. I'm not sure if this is a blocker, now I cannot > even build x86-32 tests. > > Finally, I moved some implementation of public methods and methods that call > into intrinsics to the concrete class as that may help the compiler know the > correct types of the variables. > > Please take a look and leave reviews. Thanks a lot. > > The description of the original PR: > > This patch reimplements `VectorShuffle` implementations to be a vector of the > bit type. Currently, `VectorShuffle` is stored as a byte array, and would be > expanded upon usage. This poses several drawbacks: > > Inefficient conversions between a shuffle and its corresponding vector. This > hinders the performance when the shuffle indices are not constant and are > loaded or computed dynamically. > Redundant expansions in `rearrange` operations. On all platforms, it seems > that a shuffle index vector is always expanded to the correct type before > executing the `rearrange` operations. > Some redundant intrinsics are needed to support this handling as well as > special considerations in the C2 compiler. > Range checks are performed using `VectorShuffle::toVector`, which is > inefficient for FP types since both FP conversions and FP comparisons are > more expensive than the integral ones. > Upon these changes, a `rearrange` can emit more efficient code: > > var species = IntVector.SPECIES_128; > var v1 = IntVector.fromArray(species, SRC1, 0); > var v2 = IntVector.fromArray(species, SRC2, 0); > v1.rearrange(v2.toShuffle()).intoArray(DST, 0); > > Before: > movabs $0x751589fa8,%r10 ; {oop([I{0x0000000751589fa8})} > vmovdqu 0x10(%r10),%xmm2 > movabs $0x7515a0d08,%r10 ; {oop([I{0x00000007515a0d08})} > vmovdqu 0x10(%r10),%xmm1 > movabs $0x75158afb8,%r10 ; {oop([I{0x000000075158afb8})} > vmovdqu 0x10(%r10),%xmm0 > vpand -0xddc12(%rip),%xmm0,%xmm0 # Stub::vector_int_to_byte_mask > ; {ex...
Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision: adverb order ------------- Changes: - all: https://git.openjdk.org/jdk/pull/21042/files - new: https://git.openjdk.org/jdk/pull/21042/files/85208df1..5cfac5ad Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=21042&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=21042&range=06-07 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/21042.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/21042/head:pull/21042 PR: https://git.openjdk.org/jdk/pull/21042