On Wed, 18 Sep 2024 17:00:30 GMT, Sandhya Viswanathan 
<sviswanat...@openjdk.org> wrote:

>> Currently the rearrange and selectFrom APIs check shuffle indices and throw 
>> IndexOutOfBoundsException if there is any exceptional source index in the 
>> shuffle. This causes the generated code to be less optimal. This PR modifies 
>> the rearrange/selectFrom Vector API methods to perform wrapIndexes instead 
>> of checkIndexes and performs optimizations to generate efficient code.
>> 
>> Summary of changes is as follows:
>>  1) The rearrange/selectFrom methods do wrapIndexes instead of checkIndexes.
>>  2) Intrinsic for wrapIndexes and selectFrom to generate efficient code
>> 
>> For the following source:
>> 
>> 
>>     public void test() {
>>         var index = ByteVector.fromArray(bspecies128, shuffles[1], 0);
>>         for (int j = 0; j < bspecies128.loopBound(size); j += 
>> bspecies128.length()) {
>>             var inpvect = ByteVector.fromArray(bspecies128, byteinp, j);
>>             index.selectFrom(inpvect).intoArray(byteres, j);
>>         }
>>     }
>> 
>> 
>> The code generated for inner main now looks as follows:
>> ;; B24: #      out( B24 B25 ) <- in( B23 B24 ) Loop( B24-B24 inner main of 
>> N173 strip mined) Freq: 4160.96
>>   0x00007f40d02274d0:   movslq %ebx,%r13
>>   0x00007f40d02274d3:   vmovdqu 0x10(%rsi,%r13,1),%xmm1
>>   0x00007f40d02274da:   vpshufb %xmm2,%xmm1,%xmm1
>>   0x00007f40d02274df:   vmovdqu %xmm1,0x10(%rax,%r13,1)
>>   0x00007f40d02274e6:   vmovdqu 0x20(%rsi,%r13,1),%xmm1
>>   0x00007f40d02274ed:   vpshufb %xmm2,%xmm1,%xmm1
>>   0x00007f40d02274f2:   vmovdqu %xmm1,0x20(%rax,%r13,1)
>>   0x00007f40d02274f9:   vmovdqu 0x30(%rsi,%r13,1),%xmm1
>>   0x00007f40d0227500:   vpshufb %xmm2,%xmm1,%xmm1
>>   0x00007f40d0227505:   vmovdqu %xmm1,0x30(%rax,%r13,1)
>>   0x00007f40d022750c:   vmovdqu 0x40(%rsi,%r13,1),%xmm1
>>   0x00007f40d0227513:   vpshufb %xmm2,%xmm1,%xmm1
>>   0x00007f40d0227518:   vmovdqu %xmm1,0x40(%rax,%r13,1)
>>   0x00007f40d022751f:   add    $0x40,%ebx
>>   0x00007f40d0227522:   cmp    %r8d,%ebx
>>   0x00007f40d0227525:   jl     0x00007f40d02274d0
>> 
>> Best Regards,
>> Sandhya
>
> Sandhya Viswanathan has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Change method name

src/hotspot/share/opto/vectorIntrinsics.cpp line 772:

> 770: 
> 771:   if (elem_klass == nullptr || shuffle_klass == nullptr || 
> shuffle->is_top() || vlen == nullptr) {
> 772:     return false; // dead code

Why dead code  in comment ? this is a failed intrinsification condition.

src/hotspot/share/opto/vectorIntrinsics.cpp line 776:

> 774:   if (!vlen->is_con() || shuffle_klass->const_oop() == nullptr) {
> 775:     return false; // not enough info for intrinsification
> 776:   }

Why don't you club it with above conditions to be consistent with other inline 
expanders ?

src/hotspot/share/opto/vectorIntrinsics.cpp line 790:

> 788:   // Shuffles use byte array based backing storage
> 789:   BasicType shuffle_bt = T_BYTE;
> 790: 

No need a of new line b/w 789 and 791

src/hotspot/share/opto/vectorIntrinsics.cpp line 793:

> 791:   if (!arch_supports_vector(Op_AndV, num_elem, shuffle_bt, 
> VecMaskNotUsed) ||
> 792:       !arch_supports_vector(Op_Replicate, num_elem, shuffle_bt, 
> VecMaskNotUsed)) {
> 793:     return false;

You should emit proper intrinsification failure message here.

src/hotspot/share/opto/vectorIntrinsics.cpp line 805:

> 803:   const TypeVect* vt  = TypeVect::make(shuffle_bt, num_elem);
> 804:   const Type* shuffle_type_bt = Type::get_const_basic_type(shuffle_bt);
> 805: 

No need of a blank line here.

src/hotspot/share/opto/vectorIntrinsics.cpp line 808:

> 806:   Node* mod_mask = gvn().makecon(TypeInt::make(num_elem-1));
> 807:   Node* bcast_mod_mask  = 
> gvn().transform(VectorNode::scalar2vector(mod_mask, num_elem, 
> shuffle_type_bt));
> 808: 

Remove redundant new line.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20634#discussion_r1766272449
PR Review Comment: https://git.openjdk.org/jdk/pull/20634#discussion_r1766273205
PR Review Comment: https://git.openjdk.org/jdk/pull/20634#discussion_r1766273880
PR Review Comment: https://git.openjdk.org/jdk/pull/20634#discussion_r1766274718
PR Review Comment: https://git.openjdk.org/jdk/pull/20634#discussion_r1766275107
PR Review Comment: https://git.openjdk.org/jdk/pull/20634#discussion_r1766275345

Reply via email to