On Sat, 1 Apr 2023 07:44:25 GMT, Quan Anh Mai <qa...@openjdk.org> wrote:
>> `Vector::slice` is a method at the top-level class of the Vector API that >> concatenates the 2 inputs into an intermediate composite and extracts a >> window equal to the size of the inputs into the result. It is used in vector >> conversion methods where the part number is not 0 to slice the parts to the >> correct positions. Slicing is also used in text processing such as utf8 and >> utf16 validation. x86 starting from SSSE3 has `palignr` which does vector >> slicing very efficiently. As a result, I think it is beneficial to add a C2 >> node for this operation as well as intrinsify `Vector::slice` method. >> >> A slice is currently implemented as >> `v2.rearrange(iota).blend(v1.rearrange(iota), blendMask)` which requires >> preparation of the index vector and the blending mask. Even with the >> preparations being hoisted out of the loops, microbenchmarks show >> improvement using the slice instrinsics. Some have tremendous increases in >> throughput due to the limitation that a mask of length 2 cannot currently be >> intrinsified, leading to falling back to the Java implementations. >> >> Please take a look and have some reviews. Thank you very much. > > Quan Anh Mai has updated the pull request with a new target base due to a > merge or a rebase. The pull request now contains ten commits: > > - instruction asserts > - Merge branch 'master' into sliceIntrinsics > - add comments explaining anonymous classes > - address reviews > - sse2, increase warmup > - aesthetic > - optimise 64B > - add jmh > - vector slice intrinsics src/hotspot/share/opto/vectorIntrinsics.cpp line 1935: > 1933: return false; // should be primitive type > 1934: } > 1935: BasicType elem_bt = elem_type->basic_type(); Code style: It's better to add a blank line between different blocks. src/hotspot/share/opto/vectorIntrinsics.cpp line 1941: > 1939: if (C->print_intrinsics()) { > 1940: tty->print_cr(" ** not supported: arity=2 op=slice vlen=%d > etype=%s ismask=notused", > 1941: num_elem, type2name(elem_bt)); `ismask=notused` could be removed. We used `ismask` in other intrinsics to print whether it is a vector mask operation instead of vector class. src/hotspot/share/opto/vectorIntrinsics.cpp line 1954: > 1952: if (v1 == NULL || v2 == NULL) { > 1953: return false; // operand unboxing failed > 1954: } Suggest to reorder line-1950 and the if-statement in line-1952. And then we doesn't need too more spaces in the variable definition `Node* v1 = unbox_vector(xxx)`. Besides, could we rename variable `o` to `index` or `origin` ? I know you'v used `origin` at the begin, maybe we can rename it to `origin_type`. I see the similari name style in `inline_vector_frombits_coerced`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/12909#discussion_r1156645453 PR Review Comment: https://git.openjdk.org/jdk/pull/12909#discussion_r1156646311 PR Review Comment: https://git.openjdk.org/jdk/pull/12909#discussion_r1156653402