Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v6]

2022-11-01 Thread vpaprotsk
On Tue, 1 Nov 2022 23:49:17 GMT, Vladimir Ivanov wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2002: >> >>> 2000: } >>> 2001: >>> 2002: address StubGenerator::generate_poly1305_masksCP() { >> >> I suggest to turn it into a C++ literal constant and move the declaration >> next to

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v5]

2022-10-28 Thread vpaprotsk
On Wed, 26 Oct 2022 15:27:55 GMT, vpaprotsk wrote: >> src/java.base/share/classes/com/sun/crypto/provider/Poly1305.java line 296: >> >>> 294: keyBytes[12] &= (byte)252; >>> 295: >>> 296: // This should be enabled, but Poly1305KAT wo
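
For context on the quoted `keyBytes[12] &= (byte)252;` line: it looks like part of the standard Poly1305 clamping of the `r` half of the key, as described in RFC 8439. Below is a minimal sketch of that clamping step, not the code under review. The class name `Poly1305Clamp` and method `clampR` are hypothetical, and the input is assumed to be the 16-byte little-endian `r` value.

```java
public class Poly1305Clamp {
    /**
     * Returns a clamped copy of the 16-byte little-endian r value.
     * Hypothetical helper for illustration only; it follows the RFC 8439
     * rule (top 4 bits of bytes 3, 7, 11, 15 cleared; bottom 2 bits of
     * bytes 4, 8, 12 cleared).
     */
    static byte[] clampR(byte[] r) {
        if (r.length != 16) {
            throw new IllegalArgumentException("r must be 16 bytes");
        }
        byte[] out = r.clone();
        // Clear the top four bits of bytes 3, 7, 11 and 15.
        out[3] &= 15;
        out[7] &= 15;
        out[11] &= 15;
        out[15] &= 15;
        // Clear the bottom two bits of bytes 4, 8 and 12.
        out[4] &= (byte) 252;
        out[8] &= (byte) 252;
        out[12] &= (byte) 252;
        return out;
    }
}
```

Working on a copy keeps the caller's key material unchanged; whether the real implementation clamps in place is not visible from the truncated snippet.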

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v5]

2022-10-28 Thread vpaprotsk
On Fri, 28 Oct 2022 19:46:33 GMT, vpaprotsk wrote: >> src/java.base/share/classes/com/sun/crypto/provider/Poly1305.java line 175: >> >>> 173: // Choice of 1024 is arbitrary, need enough data blocks to >>> amortize conversion overhead >>> 174:

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v5]

2022-10-28 Thread vpaprotsk
On Thu, 27 Oct 2022 21:19:06 GMT, Jamil Nimeh wrote: >>> 10% is not a negligible impact. I see your point about AVX512 reaping the >>> rewards of this change, but there are plenty of x86_64 systems without >>> AVX512 that will be impacted, not to mention other platforms like aarch64 >>> which

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v5]

2022-10-28 Thread vpaprotsk
On Thu, 27 Oct 2022 09:29:52 GMT, Jatin Bhateja wrote: >> vpaprotsk has updated the pull request incrementally with one additional >> commit since the last revision: >> >> extra whitespace character > > src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2040:

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v5]

2022-10-28 Thread vpaprotsk
On Wed, 26 Oct 2022 15:47:28 GMT, vpaprotsk wrote: >> src/hotspot/cpu/x86/macroAssembler_x86_poly.cpp line 806: >> >>> 804: evmovdquq(A0, Address(rsp, 64*0), Assembler::AVX_512bit); >>> 805: evmovdquq(A0, Address(rsp, 64*1), Assembler::AVX_512bit); >>>

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v5]

2022-10-28 Thread vpaprotsk
On Mon, 24 Oct 2022 23:38:16 GMT, Sandhya Viswanathan wrote: >> vpaprotsk has updated the pull request incrementally with one additional >> commit since the last revision: >> >> extra whitespace character > > src/hotspot/cpu/x86/assembler_x86.cpp line

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v6]

2022-10-28 Thread vpaprotsk
pt8 1770028.718 ± 100847.766 ops/s > Poly1305DigestBench.digest 16384 thrpt 8 765547.287 ± 25883.825 ops/s > Poly1305DigestBench.digest 1048576 thrpt 8 14508.458 ± 56.147 ops/s vpaprotsk has updated the pull request in

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v5]

2022-10-28 Thread vpaprotsk
On Thu, 27 Oct 2022 09:33:32 GMT, Jatin Bhateja wrote: >> vpaprotsk has updated the pull request incrementally with one additional >> commit since the last revision: >> >> extra whitespace character > > src/hotspot/cpu/x86/macroAssembler_x86_poly.cpp line 8

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v5]

2022-10-28 Thread vpaprotsk
On Thu, 27 Oct 2022 05:10:59 GMT, Jatin Bhateja wrote: >> vpaprotsk has updated the pull request incrementally with one additional >> commit since the last revision: >> >> extra whitespace character > > src/java.base/share/classes/com/sun/crypto/provider/Pol

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v5]

2022-10-26 Thread vpaprotsk
On Tue, 25 Oct 2022 21:48:47 GMT, Jamil Nimeh wrote: >> vpaprotsk has updated the pull request incrementally with one additional >> commit since the last revision: >> >> extra whitespace character > > src/java.base/share/classes/com/sun/crypto/provider/Pol

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v5]

2022-10-26 Thread vpaprotsk
On Tue, 25 Oct 2022 23:48:49 GMT, Sandhya Viswanathan wrote: >> vpaprotsk has updated the pull request incrementally with one additional >> commit since the last revision: >> >> extra whitespace character > > src/hotspot/cpu/x86/macroAssembler_x86_poly.cpp li

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v5]

2022-10-26 Thread vpaprotsk
On Tue, 25 Oct 2022 21:57:34 GMT, Jamil Nimeh wrote: >> vpaprotsk has updated the pull request incrementally with one additional >> commit since the last revision: >> >> extra whitespace character > > src/java.base/share/classes/com/sun/crypto/provider/Pol

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v5]

2022-10-24 Thread vpaprotsk
pt8 1770028.718 ± 100847.766 ops/s > Poly1305DigestBench.digest 16384 thrpt 8 765547.287 ± 25883.825 ops/s > Poly1305DigestBench.digest 1048576 thrpt 8 14508.458 ± 56.147 ops/s vpaprotsk has updated the pull request

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v3]

2022-10-24 Thread vpaprotsk
On Mon, 24 Oct 2022 20:31:31 GMT, Sandhya Viswanathan wrote: >> vpaprotsk has refreshed the contents of this pull request, and previous >> commits have been removed. The incremental views will show differences >> compared to the previous content of the PR. The pull re

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v4]

2022-10-24 Thread vpaprotsk
On Tue, 18 Oct 2022 23:03:55 GMT, Sandhya Viswanathan wrote: >> vpaprotsk has updated the pull request with a new target base due to a merge >> or a rebase. The pull request now contains eight commits: >> >> - assembler checks and test case fixes >> - Merge

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v4]

2022-10-24 Thread vpaprotsk
pt8 1770028.718 ± 100847.766 ops/s > Poly1305DigestBench.digest 16384 thrpt 8 765547.287 ± 25883.825 ops/s > Poly1305DigestBench.digest 1048576 thrpt 8 14508.458 ± 56.147 ops/s vpaprotsk has updated the pull request with a new

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v4]

2022-10-24 Thread vpaprotsk
On Tue, 18 Oct 2022 06:26:38 GMT, Jatin Bhateja wrote: >> vpaprotsk has updated the pull request with a new target base due to a merge >> or a rebase. The pull request now contains eight commits: >> >> - assembler checks and test case fixes >> - Merge remote-

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions

2022-10-21 Thread vpaprotsk
On Fri, 21 Oct 2022 18:20:10 GMT, Vladimir Kozlov wrote: >> Handcrafted x86_64 asm for Poly1305. Main optimization is to process 16 >> message blocks at a time. For more details, left a lot of comments in >> `macroAssembler_x86_poly.cpp`. >> >> - Added new KAT test for Poly1305 and a fuzz test

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v3]

2022-10-21 Thread vpaprotsk
pt8 1770028.718 ± 100847.766 ops/s > Poly1305DigestBench.digest 16384 thrpt 8 765547.287 ± 25883.825 ops/s > Poly1305DigestBench.digest 1048576 thrpt 8 14508.458 ± 56.147 ops/s vpaprotsk has refreshed the contents of this pul

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions

2022-10-21 Thread vpaprotsk
On Wed, 5 Oct 2022 21:28:26 GMT, vpaprotsk wrote: > Handcrafted x86_64 asm for Poly1305. Main optimization is to process 16 > message blocks at a time. For more details, left a lot of comments in > `macroAssembler_x86_poly.cpp`. > > - Added new KAT test for Poly1305 and a fuzz
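
The "process 16 message blocks at a time" idea rests on an algebraic identity: the Horner evaluation of the Poly1305 polynomial can be split into independent lanes that each multiply by a power of `r`, then recombined at the end. The sketch below checks that identity with `BigInteger`; it is an illustration under stated assumptions, not the AVX512 stub from the pull request. The class `Poly1305Lanes` and its methods are hypothetical names, blocks are assumed to be already converted to integers below the prime, and the block count is assumed to be a multiple of the lane count.

```java
import java.math.BigInteger;
import java.util.Arrays;

public class Poly1305Lanes {
    // The Poly1305 prime, 2^130 - 5.
    static final BigInteger P =
            BigInteger.ONE.shiftLeft(130).subtract(BigInteger.valueOf(5));

    /** Reference evaluation: h = (h + m) * r mod P for each block in order. */
    static BigInteger sequential(BigInteger[] blocks, BigInteger r) {
        BigInteger h = BigInteger.ZERO;
        for (BigInteger m : blocks) {
            h = h.add(m).multiply(r).mod(P);
        }
        return h;
    }

    /**
     * Lane-striped evaluation (hypothetical helper): lane j accumulates
     * blocks j, j + lanes, j + 2*lanes, ... using r^lanes as the multiplier,
     * and the lanes are combined with r^(lanes - j) at the end.
     */
    static BigInteger striped(BigInteger[] blocks, BigInteger r, int lanes) {
        if (lanes <= 0 || blocks.length % lanes != 0) {
            throw new IllegalArgumentException("block count must be a multiple of lanes");
        }
        BigInteger rPowLanes = r.modPow(BigInteger.valueOf(lanes), P);
        BigInteger[] acc = new BigInteger[lanes];
        Arrays.fill(acc, BigInteger.ZERO);
        // Each group of `lanes` blocks updates all lanes independently;
        // this independence is what a vector implementation can exploit.
        for (int g = 0; g < blocks.length; g += lanes) {
            for (int j = 0; j < lanes; j++) {
                acc[j] = acc[j].multiply(rPowLanes).add(blocks[g + j]).mod(P);
            }
        }
        // Recombine: lane j still owes a factor of r^(lanes - j).
        BigInteger h = BigInteger.ZERO;
        for (int j = 0; j < lanes; j++) {
            BigInteger weight = r.modPow(BigInteger.valueOf(lanes - j), P);
            h = h.add(acc[j].multiply(weight)).mod(P);
        }
        return h;
    }
}
```

Calling `striped(blocks, r, 16)` should give the same value as `sequential(blocks, r)` for any input whose length is a multiple of 16. A real vectorized implementation would also have to deal with limb representation, carries, and leftover blocks, none of which this sketch covers.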

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v2]

2022-10-21 Thread vpaprotsk
pt8 1770028.718 ± 100847.766 ops/s > Poly1305DigestBench.digest 16384 thrpt 8 765547.287 ± 25883.825 ops/s > Poly1305DigestBench.digest 1048576 thrpt 8 14508.458 ± 56.147 ops/s vpaprotsk has updated the pull request i

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions

2022-10-21 Thread vpaprotsk
On Fri, 21 Oct 2022 09:57:14 GMT, Tobias Hartmann wrote: >> Handcrafted x86_64 asm for Poly1305. Main optimization is to process 16 >> message blocks at a time. For more details, left a lot of comments in >> `macroAssembler_x86_poly.cpp`. >> >> - Added new KAT test for Poly1305 and a fuzz test

Re: RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions

2022-10-14 Thread vpaprotsk
On Wed, 5 Oct 2022 21:28:26 GMT, vpaprotsk wrote: > Handcrafted x86_64 asm for Poly1305. Main optimization is to process 16 > message blocks at a time. For more details, left a lot of comments in > `macroAssembler_x86_poly.cpp`. > > - Added new KAT test for Poly1305 and a fuzz

RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions

2022-10-14 Thread vpaprotsk
thrpt 8 14508.458 ± 56.147 ops/s - Commit messages: - missed white-space fix - - Fix whitespace and copyright statements - Merge remote-tracking branch 'vpaprotsk/master' into avx512-poly - Poly1305 AVX512 intrinsic for x86_64 Changes: https://git.openjdk.org/