Re: RFR: 8338021: Support saturating vector operators in VectorAPI [v4]

Sandhya Viswanathan Mon, 26 Aug 2024 16:16:10 -0700

On Mon, 19 Aug 2024 07:19:30 GMT, Jatin Bhateja <[email protected]> wrote:


>> Hi All,
>> 
>> As per the discussion on panama-dev mailing list[1], patch adds the support 
>> following new vector operators.
>> 
>> 
>>      . SUADD   : Saturating unsigned addition.
>>      . SADD    : Saturating signed addition. 
>>      . SUSUB   : Saturating unsigned subtraction.
>>      . SSUB    : Saturating signed subtraction.
>>      . UMAX    : Unsigned max
>>      . UMIN    : Unsigned min.
>>      
>> 
>> New vector operators are applicable to only integral types since their 
>> values wraparound in over/underflowing scenarios after setting appropriate 
>> status flags. For floating point types, as per IEEE 754 specs there are 
>> multiple schemes to handler underflow, one of them is gradual underflow 
>> which transitions the value to subnormal range. Similarly, overflow 
>> implicitly saturates the floating-point value to an Infinite value.
>> 
>> As the name suggests, these are saturating operations, i.e. the result of 
>> the computation is strictly capped by lower and upper bounds of the result 
>> type and is not wrapped around in underflowing or overflowing scenarios.
>> 
>> Summary of changes:
>> - Java side implementation of new vector operators.
>> - Add new scalar saturating APIs for each of the above saturating vector 
>> operator in corresponding primitive box classes, fallback implementation of 
>> vector operators is based over it.
>> - C2 compiler IR and inline expander changes.
>> - Optimized x86 backend implementation for new vector operators and their 
>> predicated counterparts.
>> - Extends existing VectorAPI Jtreg test suite to cover new operations.
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> PS: Intrinsification and auto-vectorization of new core-lib API will be 
>> addressed separately in a follow-up patch.
>> 
>> [1] https://mail.openjdk.org/pipermail/panama-dev/2024-May/020408.html
>
> Jatin Bhateja has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Review comments resolutions.

src/hotspot/cpu/x86/assembler_x86.cpp line 3454:

> 3452: 
> 3453: void Assembler::evmovdquw(XMMRegister dst, KRegister mask, XMMRegister 
> src, bool merge, int vector_len) {
> 3454:   assert(VM_Version::supports_avx512vlbw(), "");

vl not needed for 512 bit.

src/hotspot/cpu/x86/assembler_x86.cpp line 4583:

> 4581: void Assembler::evpcmpgtb(KRegister kdst, XMMRegister nds, Address src, 
> int vector_len) {
> 4582:   assert(VM_Version::supports_avx512vlbw(), "");
> 4583:   assert(vector_len == Assembler::AVX_512bit || 
> VM_Version::supports_avx512vl(), "");

The check for supports_avx512vlbw() at previous line in this function need to 
be changed to supports_avx512bw(). 
If it helps, the vl check is already happening in vex_prefix() if we use the 
higher bank registers for length < 512 bit.

src/hotspot/cpu/x86/assembler_x86.cpp line 4596:

> 4594: void Assembler::evpcmpgtb(KRegister kdst, KRegister mask, XMMRegister 
> nds, Address src, int vector_len) {
> 4595:   assert(VM_Version::supports_avx512vlbw(), "");
> 4596:   assert(vector_len == Assembler::AVX_512bit || 
> VM_Version::supports_avx512vl(), "");

The check for supports_avx512vlbw() at previous line in this function need to 
be changed to supports_avx512bw().

src/hotspot/cpu/x86/assembler_x86.cpp line 4611:

> 4609: void Assembler::evpcmpub(KRegister kdst, XMMRegister nds, XMMRegister 
> src, ComparisonPredicate vcc, int vector_len) {
> 4610:   assert(vector_len == Assembler::AVX_512bit || 
> VM_Version::supports_avx512vl(), "");
> 4611:   assert(VM_Version::supports_avx512vlbw(), "");

I think you meant this to be supports_avx512bw().

src/hotspot/cpu/x86/assembler_x86.cpp line 4620:

> 4618: void Assembler::evpcmpuw(KRegister kdst, XMMRegister nds, XMMRegister 
> src, ComparisonPredicate vcc, int vector_len) {
> 4619:   assert(vector_len == Assembler::AVX_512bit || 
> VM_Version::supports_avx512vl(), "");
> 4620:   assert(VM_Version::supports_avx512vlbw(), "");

The check for supports_avx512vlbw() in this function need to be changed to 
supports_avx512bw().

src/hotspot/cpu/x86/assembler_x86.cpp line 4645:

> 4643: void Assembler::evpcmpuw(KRegister kdst, XMMRegister nds, Address src, 
> ComparisonPredicate vcc, int vector_len) {
> 4644:   assert(VM_Version::supports_avx512vlbw(), "");
> 4645:   assert(vector_len == Assembler::AVX_512bit || 
> VM_Version::supports_avx512vl(), "");

The check for supports_avx512vlbw() at previous line in this function need to 
be changed to supports_avx512bw().

src/hotspot/cpu/x86/assembler_x86.cpp line 4672:

> 4670: void Assembler::evpcmpeqb(KRegister kdst, KRegister mask, XMMRegister 
> nds, Address src, int vector_len) {
> 4671:   assert(VM_Version::supports_avx512vlbw(), "");
> 4672:   assert(vector_len == Assembler::AVX_512bit || 
> VM_Version::supports_avx512vl(), "");

The check for supports_avx512vlbw() at previous line in this function need to 
be changed to supports_avx512bw().

src/hotspot/cpu/x86/assembler_x86.cpp line 8191:

> 8189: void Assembler::vpminub(XMMRegister dst, XMMRegister nds, XMMRegister 
> src, int vector_len) {
> 8190:   assert(UseAVX > 0 && (vector_len == Assembler::AVX_512bit || 
> (!needs_evex(dst, nds, src) || VM_Version::supports_avx512vl())), "");
> 8191:   assert(!needs_evex(dst, nds, src) || VM_Version::supports_avx512bw(), 
> "");

It will be good to keep the assert similar to vpaddsb for new vmin/vmax 
instructions.

src/hotspot/cpu/x86/assembler_x86.cpp line 8311:

> 8309: }
> 8310: 
> 8311: void Assembler::evpminud(XMMRegister dst, KRegister mask, XMMRegister 
> nds, Address src, bool merge, int vector_len) {

assert(VM_Version::supports_evex(), "") check missing.

src/hotspot/cpu/x86/assembler_x86.cpp line 8340:

> 8338: 
> 8339: void Assembler::evpminuq(XMMRegister dst, KRegister mask, XMMRegister 
> nds, Address src, bool merge, int vector_len) {
> 8340:   assert(vector_len == AVX_512bit || VM_Version::supports_avx512vl(), 
> "");

assert(VM_Version::supports_evex(), "") check missing.

src/hotspot/cpu/x86/assembler_x86.cpp line 8402:

> 8400: void Assembler::vpmaxuw(XMMRegister dst, XMMRegister nds, XMMRegister 
> src, int vector_len) {
> 8401:   assert(vector_len == AVX_128bit ? VM_Version::supports_avx() :
> 8402:         (vector_len == AVX_256bit ? VM_Version::supports_avx2() : 
> VM_Version::supports_avx512bw()), "");

Why support_avx() check here only and not in other newly added  v* integral 
instructions? For avx1 platforms, integral vector width supported is only 
128bit.

src/hotspot/cpu/x86/assembler_x86.cpp line 8478:

> 8476: 
> 8477: void Assembler::evpmaxud(XMMRegister dst, KRegister mask, XMMRegister 
> nds, Address src, bool merge, int vector_len) {
> 8478:   assert(vector_len == AVX_512bit || VM_Version::supports_avx512vl(), 
> "");

assert(VM_Version::supports_evex(), "") is  missing.

src/hotspot/cpu/x86/assembler_x86.cpp line 8506:

> 8504: 
> 8505: void Assembler::evpmaxuq(XMMRegister dst, KRegister mask, XMMRegister 
> nds, Address src, bool merge, int vector_len) {
> 8506:   assert(vector_len == AVX_512bit || VM_Version::supports_avx512vl(), 
> "");

assert(VM_Version::supports_evex(), "") is missing.

src/hotspot/cpu/x86/assembler_x86.cpp line 10229:

> 10227:   InstructionMark im(this);
> 10228:   assert(VM_Version::supports_avx512bw() && (vector_len == AVX_512bit 
> || VM_Version::supports_avx512vl()), "");
> 10229:   InstructionAttr attributes(vector_len, /* vex_w */ true,/* 
> legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true);

vex_w could be false here.

src/hotspot/cpu/x86/assembler_x86.cpp line 10256:

> 10254:   InstructionMark im(this);
> 10255:   assert(VM_Version::supports_avx512bw() && (vector_len == AVX_512bit 
> || VM_Version::supports_avx512vl()), "");
> 10256:   InstructionAttr attributes(vector_len, /* vex_w */ true,/* 
> legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true);

vex_w could be false here.

src/hotspot/cpu/x86/assembler_x86.cpp line 10283:

> 10281:   InstructionMark im(this);
> 10282:   assert(VM_Version::supports_avx512bw() && (vector_len == AVX_512bit 
> || VM_Version::supports_avx512vl()), "");
> 10283:   InstructionAttr attributes(vector_len, /* vex_w */ true,/* 
> legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true);

vex_w could be false here.

src/hotspot/cpu/x86/assembler_x86.cpp line 10310:

> 10308:   InstructionMark im(this);
> 10309:   assert(VM_Version::supports_avx512bw() && (vector_len == AVX_512bit 
> || VM_Version::supports_avx512vl()), "");
> 10310:   InstructionAttr attributes(vector_len, /* vex_w */ true,/* 
> legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true);

vex_w could be false here.

src/hotspot/cpu/x86/assembler_x86.cpp line 10337:

> 10335:   InstructionMark im(this);
> 10336:   assert(VM_Version::supports_avx512bw() && (vector_len == AVX_512bit 
> || VM_Version::supports_avx512vl()), "");
> 10337:   InstructionAttr attributes(vector_len, /* vex_w */ true,/* 
> legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true);

vex_w could be false here.

src/hotspot/cpu/x86/assembler_x86.cpp line 10364:

> 10362:   InstructionMark im(this);
> 10363:   assert(VM_Version::supports_avx512bw() && (vector_len == AVX_512bit 
> || VM_Version::supports_avx512vl()), "");
> 10364:   InstructionAttr attributes(vector_len, /* vex_w */ true,/* 
> legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true);

vex_w could be false here.

src/hotspot/cpu/x86/assembler_x86.cpp line 10391:

> 10389:   InstructionMark im(this);
> 10390:   assert(VM_Version::supports_avx512bw() && (vector_len == AVX_512bit 
> || VM_Version::supports_avx512vl()), "");
> 10391:   InstructionAttr attributes(vector_len, /* vex_w */ true,/* 
> legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true);

vex_w could be false here.

src/hotspot/cpu/x86/assembler_x86.cpp line 10419:

> 10417:   InstructionMark im(this);
> 10418:   assert(VM_Version::supports_avx512bw() && (vector_len == AVX_512bit 
> || VM_Version::supports_avx512vl()), "");
> 10419:   InstructionAttr attributes(vector_len, /* vex_w */ true,/* 
> legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true);

vex_w could be false here.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731912227
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731608860
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731609177
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731917735
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731612730
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731726012
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731726337
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731748671
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731769490
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731771330
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731823750
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731870793
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731870288
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731888852
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731889468
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731890265
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731909994
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731910246
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731910516
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731910755
PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731911129

Re: RFR: 8338021: Support saturating vector operators in VectorAPI [v4]

Reply via email to