On Mon, 19 Aug 2024 07:19:30 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:
>> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> following new vector operators. >> >> >> . SUADD : Saturating unsigned addition. >> . SADD : Saturating signed addition. >> . SUSUB : Saturating unsigned subtraction. >> . SSUB : Saturating signed subtraction. >> . UMAX : Unsigned max >> . UMIN : Unsigned min. >> >> >> New vector operators are applicable to only integral types since their >> values wraparound in over/underflowing scenarios after setting appropriate >> status flags. For floating point types, as per IEEE 754 specs there are >> multiple schemes to handler underflow, one of them is gradual underflow >> which transitions the value to subnormal range. Similarly, overflow >> implicitly saturates the floating-point value to an Infinite value. >> >> As the name suggests, these are saturating operations, i.e. the result of >> the computation is strictly capped by lower and upper bounds of the result >> type and is not wrapped around in underflowing or overflowing scenarios. >> >> Summary of changes: >> - Java side implementation of new vector operators. >> - Add new scalar saturating APIs for each of the above saturating vector >> operator in corresponding primitive box classes, fallback implementation of >> vector operators is based over it. >> - C2 compiler IR and inline expander changes. >> - Optimized x86 backend implementation for new vector operators and their >> predicated counterparts. >> - Extends existing VectorAPI Jtreg test suite to cover new operations. >> >> Kindly review and share your feedback. >> >> Best Regards, >> PS: Intrinsification and auto-vectorization of new core-lib API will be >> addressed separately in a follow-up patch. >> >> [1] https://mail.openjdk.org/pipermail/panama-dev/2024-May/020408.html > > Jatin Bhateja has updated the pull request incrementally with one additional > commit since the last revision: > > Review comments resolutions. src/hotspot/cpu/x86/assembler_x86.cpp line 3454: > 3452: > 3453: void Assembler::evmovdquw(XMMRegister dst, KRegister mask, XMMRegister > src, bool merge, int vector_len) { > 3454: assert(VM_Version::supports_avx512vlbw(), ""); vl not needed for 512 bit. src/hotspot/cpu/x86/assembler_x86.cpp line 4583: > 4581: void Assembler::evpcmpgtb(KRegister kdst, XMMRegister nds, Address src, > int vector_len) { > 4582: assert(VM_Version::supports_avx512vlbw(), ""); > 4583: assert(vector_len == Assembler::AVX_512bit || > VM_Version::supports_avx512vl(), ""); The check for supports_avx512vlbw() at previous line in this function need to be changed to supports_avx512bw(). If it helps, the vl check is already happening in vex_prefix() if we use the higher bank registers for length < 512 bit. src/hotspot/cpu/x86/assembler_x86.cpp line 4596: > 4594: void Assembler::evpcmpgtb(KRegister kdst, KRegister mask, XMMRegister > nds, Address src, int vector_len) { > 4595: assert(VM_Version::supports_avx512vlbw(), ""); > 4596: assert(vector_len == Assembler::AVX_512bit || > VM_Version::supports_avx512vl(), ""); The check for supports_avx512vlbw() at previous line in this function need to be changed to supports_avx512bw(). src/hotspot/cpu/x86/assembler_x86.cpp line 4611: > 4609: void Assembler::evpcmpub(KRegister kdst, XMMRegister nds, XMMRegister > src, ComparisonPredicate vcc, int vector_len) { > 4610: assert(vector_len == Assembler::AVX_512bit || > VM_Version::supports_avx512vl(), ""); > 4611: assert(VM_Version::supports_avx512vlbw(), ""); I think you meant this to be supports_avx512bw(). src/hotspot/cpu/x86/assembler_x86.cpp line 4620: > 4618: void Assembler::evpcmpuw(KRegister kdst, XMMRegister nds, XMMRegister > src, ComparisonPredicate vcc, int vector_len) { > 4619: assert(vector_len == Assembler::AVX_512bit || > VM_Version::supports_avx512vl(), ""); > 4620: assert(VM_Version::supports_avx512vlbw(), ""); The check for supports_avx512vlbw() in this function need to be changed to supports_avx512bw(). src/hotspot/cpu/x86/assembler_x86.cpp line 4645: > 4643: void Assembler::evpcmpuw(KRegister kdst, XMMRegister nds, Address src, > ComparisonPredicate vcc, int vector_len) { > 4644: assert(VM_Version::supports_avx512vlbw(), ""); > 4645: assert(vector_len == Assembler::AVX_512bit || > VM_Version::supports_avx512vl(), ""); The check for supports_avx512vlbw() at previous line in this function need to be changed to supports_avx512bw(). src/hotspot/cpu/x86/assembler_x86.cpp line 4672: > 4670: void Assembler::evpcmpeqb(KRegister kdst, KRegister mask, XMMRegister > nds, Address src, int vector_len) { > 4671: assert(VM_Version::supports_avx512vlbw(), ""); > 4672: assert(vector_len == Assembler::AVX_512bit || > VM_Version::supports_avx512vl(), ""); The check for supports_avx512vlbw() at previous line in this function need to be changed to supports_avx512bw(). src/hotspot/cpu/x86/assembler_x86.cpp line 8191: > 8189: void Assembler::vpminub(XMMRegister dst, XMMRegister nds, XMMRegister > src, int vector_len) { > 8190: assert(UseAVX > 0 && (vector_len == Assembler::AVX_512bit || > (!needs_evex(dst, nds, src) || VM_Version::supports_avx512vl())), ""); > 8191: assert(!needs_evex(dst, nds, src) || VM_Version::supports_avx512bw(), > ""); It will be good to keep the assert similar to vpaddsb for new vmin/vmax instructions. src/hotspot/cpu/x86/assembler_x86.cpp line 8311: > 8309: } > 8310: > 8311: void Assembler::evpminud(XMMRegister dst, KRegister mask, XMMRegister > nds, Address src, bool merge, int vector_len) { assert(VM_Version::supports_evex(), "") check missing. src/hotspot/cpu/x86/assembler_x86.cpp line 8340: > 8338: > 8339: void Assembler::evpminuq(XMMRegister dst, KRegister mask, XMMRegister > nds, Address src, bool merge, int vector_len) { > 8340: assert(vector_len == AVX_512bit || VM_Version::supports_avx512vl(), > ""); assert(VM_Version::supports_evex(), "") check missing. src/hotspot/cpu/x86/assembler_x86.cpp line 8402: > 8400: void Assembler::vpmaxuw(XMMRegister dst, XMMRegister nds, XMMRegister > src, int vector_len) { > 8401: assert(vector_len == AVX_128bit ? VM_Version::supports_avx() : > 8402: (vector_len == AVX_256bit ? VM_Version::supports_avx2() : > VM_Version::supports_avx512bw()), ""); Why support_avx() check here only and not in other newly added v* integral instructions? For avx1 platforms, integral vector width supported is only 128bit. src/hotspot/cpu/x86/assembler_x86.cpp line 8478: > 8476: > 8477: void Assembler::evpmaxud(XMMRegister dst, KRegister mask, XMMRegister > nds, Address src, bool merge, int vector_len) { > 8478: assert(vector_len == AVX_512bit || VM_Version::supports_avx512vl(), > ""); assert(VM_Version::supports_evex(), "") is missing. src/hotspot/cpu/x86/assembler_x86.cpp line 8506: > 8504: > 8505: void Assembler::evpmaxuq(XMMRegister dst, KRegister mask, XMMRegister > nds, Address src, bool merge, int vector_len) { > 8506: assert(vector_len == AVX_512bit || VM_Version::supports_avx512vl(), > ""); assert(VM_Version::supports_evex(), "") is missing. src/hotspot/cpu/x86/assembler_x86.cpp line 10229: > 10227: InstructionMark im(this); > 10228: assert(VM_Version::supports_avx512bw() && (vector_len == AVX_512bit > || VM_Version::supports_avx512vl()), ""); > 10229: InstructionAttr attributes(vector_len, /* vex_w */ true,/* > legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true); vex_w could be false here. src/hotspot/cpu/x86/assembler_x86.cpp line 10256: > 10254: InstructionMark im(this); > 10255: assert(VM_Version::supports_avx512bw() && (vector_len == AVX_512bit > || VM_Version::supports_avx512vl()), ""); > 10256: InstructionAttr attributes(vector_len, /* vex_w */ true,/* > legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true); vex_w could be false here. src/hotspot/cpu/x86/assembler_x86.cpp line 10283: > 10281: InstructionMark im(this); > 10282: assert(VM_Version::supports_avx512bw() && (vector_len == AVX_512bit > || VM_Version::supports_avx512vl()), ""); > 10283: InstructionAttr attributes(vector_len, /* vex_w */ true,/* > legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true); vex_w could be false here. src/hotspot/cpu/x86/assembler_x86.cpp line 10310: > 10308: InstructionMark im(this); > 10309: assert(VM_Version::supports_avx512bw() && (vector_len == AVX_512bit > || VM_Version::supports_avx512vl()), ""); > 10310: InstructionAttr attributes(vector_len, /* vex_w */ true,/* > legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true); vex_w could be false here. src/hotspot/cpu/x86/assembler_x86.cpp line 10337: > 10335: InstructionMark im(this); > 10336: assert(VM_Version::supports_avx512bw() && (vector_len == AVX_512bit > || VM_Version::supports_avx512vl()), ""); > 10337: InstructionAttr attributes(vector_len, /* vex_w */ true,/* > legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true); vex_w could be false here. src/hotspot/cpu/x86/assembler_x86.cpp line 10364: > 10362: InstructionMark im(this); > 10363: assert(VM_Version::supports_avx512bw() && (vector_len == AVX_512bit > || VM_Version::supports_avx512vl()), ""); > 10364: InstructionAttr attributes(vector_len, /* vex_w */ true,/* > legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true); vex_w could be false here. src/hotspot/cpu/x86/assembler_x86.cpp line 10391: > 10389: InstructionMark im(this); > 10390: assert(VM_Version::supports_avx512bw() && (vector_len == AVX_512bit > || VM_Version::supports_avx512vl()), ""); > 10391: InstructionAttr attributes(vector_len, /* vex_w */ true,/* > legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true); vex_w could be false here. src/hotspot/cpu/x86/assembler_x86.cpp line 10419: > 10417: InstructionMark im(this); > 10418: assert(VM_Version::supports_avx512bw() && (vector_len == AVX_512bit > || VM_Version::supports_avx512vl()), ""); > 10419: InstructionAttr attributes(vector_len, /* vex_w */ true,/* > legacy_mode */ false, /* no_mask_reg */ false,/* uses_vl */ true); vex_w could be false here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731912227 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731608860 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731609177 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731917735 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731612730 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731726012 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731726337 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731748671 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731769490 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731771330 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731823750 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731870793 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731870288 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731888852 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731889468 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731890265 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731909994 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731910246 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731910516 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731910755 PR Review Comment: https://git.openjdk.org/jdk/pull/20507#discussion_r1731911129