> `Vector::slice` is a method on the top-level class of the Vector API that 
> concatenates its two inputs into an intermediate composite and extracts a 
> window the size of one input as the result. It is used in vector conversion 
> methods, where a nonzero part number requires slicing the parts into the 
> correct positions. Slicing is also used in text processing, such as UTF-8 
> and UTF-16 validation. x86 has had `palignr` since SSSE3, which performs 
> vector slicing very efficiently. I therefore think it is beneficial to add a 
> C2 node for this operation and to intrinsify the `Vector::slice` method.
> 
> A slice is currently implemented as 
> `v2.rearrange(iota).blend(v1.rearrange(iota), blendMask)`, which requires 
> preparing the index vector and the blending mask. Even with this preparation 
> hoisted out of the loops, microbenchmarks show improvement with the slice 
> intrinsics. Some show tremendous increases in throughput because a mask of 
> length 2 cannot currently be intrinsified, so those cases fall back to the 
> Java implementation.
> 
> Please take a look and share your reviews. Thank you very much.
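
For readers unfamiliar with the operation, the slice semantics in the quoted description can be sketched with plain `int` arrays standing in for vectors. This is only an illustrative model, not the code in the pull request: the class `SliceSketch` and its helper methods are hypothetical names, and the `iota` and mask construction shown is an assumption about how the rearrange-and-blend formulation lines up with the concatenate-and-window definition.

```java
import java.util.Arrays;

// Illustrative model only: int[] arrays stand in for vectors of equal length.
class SliceSketch {
    // Reference semantics: concatenate a and b, then take a.length lanes
    // starting at lane `origin` of the concatenation.
    static int[] slice(int[] a, int[] b, int origin) {
        int n = a.length;
        if (b.length != n || origin < 0 || origin > n) {
            throw new IllegalArgumentException("bad length or origin");
        }
        int[] concat = new int[2 * n];
        System.arraycopy(a, 0, concat, 0, n);
        System.arraycopy(b, 0, concat, n, n);
        return Arrays.copyOfRange(concat, origin, origin + n);
    }

    // Lane i of the result is v[idx[i]].
    static int[] rearrange(int[] v, int[] idx) {
        int[] r = new int[v.length];
        for (int i = 0; i < v.length; i++) {
            r[i] = v[idx[i]];
        }
        return r;
    }

    // Lane i of the result is other[i] where mask[i] is set, else base[i].
    static int[] blend(int[] base, int[] other, boolean[] mask) {
        int[] r = new int[base.length];
        for (int i = 0; i < base.length; i++) {
            r[i] = mask[i] ? other[i] : base[i];
        }
        return r;
    }

    // Assumed shape of the rearrange-and-blend formulation: rotate both
    // inputs by `origin`, then take the leading lanes from a and the rest from b.
    static int[] sliceViaRearrangeBlend(int[] a, int[] b, int origin) {
        int n = a.length;
        int[] iota = new int[n];
        boolean[] mask = new boolean[n];
        for (int i = 0; i < n; i++) {
            iota[i] = (i + origin) % n;   // wrapped lane index
            mask[i] = i < n - origin;     // lanes that still come from a
        }
        return blend(rearrange(b, iota), rearrange(a, iota), mask);
    }

    public static void main(String[] args) {
        int[] a = {0, 1, 2, 3};
        int[] b = {4, 5, 6, 7};
        System.out.println(Arrays.toString(slice(a, b, 1)));
        System.out.println(Arrays.toString(sliceViaRearrangeBlend(a, b, 1)));
    }
}
```

Both methods agree for every origin from 0 to the vector length. The second one makes visible the index vector and mask that have to be prepared, which is the overhead a single instruction such as `palignr` avoids.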

Quan Anh Mai has updated the pull request with a new target base due to a merge 
or a rebase. The pull request now contains ten commits:

 - instruction asserts
 - Merge branch 'master' into sliceIntrinsics
 - add comments explaining anonymous classes
 - address reviews
 - sse2, increase warmup
 - aesthetic
 - optimise 64B
 - add jmh
 - vector slice intrinsics

-------------

Changes: https://git.openjdk.org/jdk/pull/12909/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12909&range=03
  Stats: 1603 lines in 58 files changed: 1277 ins; 257 del; 69 mod
  Patch: https://git.openjdk.org/jdk/pull/12909.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/12909/head:pull/12909

PR: https://git.openjdk.org/jdk/pull/12909
