On Wed, 16 Nov 2022 23:41:32 GMT, Volodymyr Paprotski wrote:
>> Yes, please. And for the upper half of register file, just code it as a loop
>> over register range:
>>
>> for (int rxmm_num = 16; rxmm_num < 30; rxmm_num++) {
>> XMMRegister rxmm = as_XMMRegister(rxmm_num);
>> __ vpxorq(rxmm,
On Wed, 16 Nov 2022 23:16:14 GMT, Volodymyr Paprotski wrote:
>> src/hotspot/cpu/x86/stubGenerator_x86_64_poly.cpp line 756:
>>
>>> 754:
>>> 755: // Store R^8-R for later use
>>> 756: __ evmovdquq(Address(rsp, 64*0), B0, Assembler::AVX_512bit);
>>
>> Could these vector spills be eliminated?
On Wed, 16 Nov 2022 23:39:00 GMT, Vladimir Ivanov wrote:
>> ah.. I remember thinking about doing that.. `vzeroall` isn't encoded yet and
>> I figured since I already have to do the xmm16-29, might as well do them
>> all.. should I add that instruction too?
>
> Yes, please. And for the upper half
On Wed, 16 Nov 2022 23:14:45 GMT, Volodymyr Paprotski wrote:
>> Or simply switch to `vzeroall` for `xmm0` - `xmm15`.
>
> ah.. I remember thinking about doing that.. `vzeroall` isn't encoded yet and I
> figured since I already have to do the xmm16-29, might as well do them all..
> should I add th
On Wed, 16 Nov 2022 23:08:16 GMT, Vladimir Ivanov wrote:
>> src/hotspot/cpu/x86/stubGenerator_x86_64_poly.cpp line 917:
>>
>>> 915: // Cleanup
>>> 916: __ vpxorq(xmm0, xmm0, xmm0, Assembler::AVX_512bit);
>>> 917: __ vpxorq(xmm1, xmm1, xmm1, Assembler::AVX_512bit);
>>
>> You could use T0,
On Wed, 16 Nov 2022 23:12:28 GMT, Vladimir Ivanov wrote:
>> Volodymyr Paprotski has updated the pull request incrementally with one
>> additional commit since the last revision:
>>
>> redo register alloc with explicit func params
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_poly.cpp line 756:
On Wed, 16 Nov 2022 20:52:14 GMT, Volodymyr Paprotski wrote:
>> Handcrafted x86_64 asm for Poly1305. Main optimization is to process 16
>> message blocks at a time. For more details, left a lot of comments in
>> `macroAssembler_x86_poly.cpp`.
>>
>> - Added new KAT test for Poly1305 and a fuzz
On Wed, 16 Nov 2022 22:47:37 GMT, Sandhya Viswanathan wrote:
>> Volodymyr Paprotski has updated the pull request incrementally with one
>> additional commit since the last revision:
>>
>> redo register alloc with explicit func params
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_poly.cpp line
On Wed, 16 Nov 2022 20:52:14 GMT, Volodymyr Paprotski wrote:
>> Handcrafted x86_64 asm for Poly1305. Main optimization is to process 16
>> message blocks at a time. For more details, left a lot of comments in
>> `macroAssembler_x86_poly.cpp`.
>>
>> - Added new KAT test for Poly1305 and a fuzz
> Handcrafted x86_64 asm for Poly1305. Main optimization is to process 16
> message blocks at a time. For more details, left a lot of comments in
> `macroAssembler_x86_poly.cpp`.
>
> - Added new KAT test for Poly1305 and a fuzz test to compare intrinsic and
> java.
> - Would like to add an `I