Re: RFR: 8318650: Optimized subword gather for x86 targets. [v3]

Jatin Bhateja Sun, 05 Nov 2023 05:18:02 -0800

On Fri, 3 Nov 2023 23:20:49 GMT, Sandhya Viswanathan <[email protected]> 
wrote:


>> Jatin Bhateja has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   Restricting masked sub-word gather to AVX512 target to align with integral 
>> gather support.
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1576:
> 
>> 1574:     Label* larr[] = { &case0, &case1, &case2, &case3, &case4, &case5, 
>> &case6, &case7 };
>> 1575:     for (int i = 0; i < 8; i++) {
>> 1576:       bt(mask, midx);
> 
> Could we not use shorter bt and inc instructions (e.g. the 32-bit forms) 
> here, as we know that we don't need 64 bits of mask? That way we will have a 
> smaller instruction encoding.

I get your point: it may save a prefix byte for short vectors in one case. But 
a REX prefix may not be avoidable if the allocator picks a register from the 
higher register bank (r8-r15), and the mask corresponding to Byte64 does need 
all 64 bits.
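
To make the encoding argument concrete, here is a rough sketch of the byte
count for a register-register bt (opcode 0F A3 /r). The helper names are
hypothetical and are not part of HotSpot's assembler. The sketch assumes only
the legacy REX prefix matters (no APX registers, no 16-bit operand-size
prefix) and that registers are numbered 0-15 with r8-r15 as 8-15.

```cpp
// Hypothetical helpers for illustration only; not HotSpot code.

// A legacy REX prefix is required when the operand size is 64 bits (REX.W)
// or when either register operand is r8-r15 (REX.R / REX.B).
constexpr bool needs_rex(int reg, int rm, bool is_64bit) {
  return is_64bit || reg >= 8 || rm >= 8;
}

// Length in bytes of `bt rm, reg` with both operands in registers:
// two opcode bytes (0F A3) plus one ModRM byte, plus an optional REX byte.
constexpr int bt_reg_reg_length(int rm, int reg, bool is_64bit) {
  return 3 + (needs_rex(reg, rm, is_64bit) ? 1 : 0);
}

// One mask bit per lane: a 64-byte vector of byte lanes needs 64 mask bits.
constexpr int mask_bits_needed(int vector_bytes, int lane_bytes) {
  return vector_bytes / lane_bytes;
}
```

Under these assumptions, bt_reg_reg_length(0, 1, false) is 3 bytes while
bt_reg_reg_length(8, 1, false) is already 4, so the 32-bit form only saves a
byte when both registers come from the lower bank. mask_bits_needed(64, 1)
is 64, which is why the Byte64 case cannot use the 32-bit form at all.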

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1382573870
