On Mon, 31 Oct 2022 02:35:18 GMT, Quan Anh Mai <qa...@openjdk.org> wrote:

>> Claes Redestad has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   Require UseSSE >= 3 due transitive use of sse3 instructions from ReduceI
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 3493:
> 
>> 3491:   // vnext = IntVector.broadcast(I256, power_of_31_backwards[0]);
>> 3492:   movdl(vnext, InternalAddress(power_of_31_backwards + (0 * sizeof(jint))));
>> 3493:   vpbroadcastd(vnext, vnext, Assembler::AVX_256bit);
> 
> `vpbroadcastd` can take an `Address` argument instead.

An `InternalAddress` isn't an `Address` but an `AddressLiteral`. You can, 
however, convert it with `as_Address(InternalAddress(power_of_31_backwards + 
(0 * sizeof(jint))))`.

> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 3528:
> 
>> 3526:     vpmulld(vcoef[idx], vcoef[idx], vnext, Assembler::AVX_256bit);
>> 3527:   }
>> 3528:   jmp(LONG_VECTOR_LOOP_BEGIN);
> 
> Calculating backward forces you to recalculate the coefficients on each 
> iteration; I think doing this normally would be better.

But doing it forward requires a `reduceLane` on each iteration. It's faster to 
do it backward.
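
To make the trade-off concrete, here is a minimal scalar sketch of the backward scheme, assuming one 8-lane vector of ints modelled as a plain `int[]`. The class and method names (`BackwardHashSketch`, `backwardHash`) are hypothetical and this is not the intrinsic's actual code: it only shows per-lane coefficients being multiplied by 31^8 on every iteration, with a single lane reduction after the loop.

```java
public class BackwardHashSketch {
    // Assumption: 8 int lanes, standing in for one 256-bit vector register.
    static final int LANES = 8;

    // Computes the same polynomial as Arrays.hashCode(int[]) for a non-null
    // array, walking the array from its end towards its start.
    static int backwardHash(int[] a) {
        // coef[l] is the power of 31 applied to lane l of the current block:
        // 31^7, 31^6, ..., 31^0 for the last block of the array.
        int[] coef = new int[LANES];
        coef[LANES - 1] = 1;
        for (int l = LANES - 2; l >= 0; l--) {
            coef[l] = coef[l + 1] * 31;
        }
        int next = coef[0] * 31; // 31^LANES, the per-iteration multiplier

        int[] acc = new int[LANES]; // per-lane partial sums
        int end = a.length;
        while (end >= LANES) {
            int base = end - LANES;
            for (int l = 0; l < LANES; l++) {
                acc[l] += coef[l] * a[base + l]; // multiply-accumulate per lane
                coef[l] *= next;                 // coefficients updated every iteration
            }
            end = base;
        }

        // Single lane reduction, done once after the loop.
        int sum = 0;
        for (int l = 0; l < LANES; l++) {
            sum += acc[l];
        }

        // Scalar handling of the leftover head elements [0, end).
        int p = coef[LANES - 1]; // 31^(a.length - end)
        for (int i = end - 1; i >= 0; i--) {
            sum += p * a[i];
            p *= 31;
        }
        return sum + p; // p is now 31^a.length, times the initial value 1
    }
}
```

The loop body does two multiplies per lane (one for the elements, one for the coefficients), which is the cost pointed out in the review comment; in exchange, the horizontal reduction happens only once.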

-------------

PR: https://git.openjdk.org/jdk/pull/10847
