> This provides a solid speedup of about 3-4x over the Java implementation.
>
> I have a vectorized version of this which uses a bunch of tricks to speed it
> up, but it's complex and can still be improved. We're getting close to ramp
> down, so I'm submitting this simple intrinsic so that we can get it reviewed
> in time.
>
> Benchmarks:
>
> ThunderX (2, I think):
>
> Benchmark                        (dataSize)  (provider)   Mode  Cnt         Score         Error  Units
> Poly1305DigestBench.updateBytes          64              thrpt    3  14078352.014 ± 4201407.966  ops/s
> Poly1305DigestBench.updateBytes         256              thrpt    3   5154958.794 ± 1717146.980  ops/s
> Poly1305DigestBench.updateBytes        1024              thrpt    3   1416563.273 ± 1311809.454  ops/s
> Poly1305DigestBench.updateBytes       16384              thrpt    3     94059.570 ±    2913.021  ops/s
> Poly1305DigestBench.updateBytes     1048576              thrpt    3      1441.024 ±     164.443  ops/s
>
> Benchmark                        (dataSize)  (provider)   Mode  Cnt         Score         Error  Units
> Poly1305DigestBench.updateBytes          64              thrpt    3   4516486.795 ±  419624.224  ops/s
> Poly1305DigestBench.updateBytes         256              thrpt    3   1228542.774 ±  202815.694  ops/s
> Poly1305DigestBench.updateBytes        1024              thrpt    3    316051.912 ±   23066.449  ops/s
> Poly1305DigestBench.updateBytes       16384              thrpt    3     20649.561 ±    1094.687  ops/s
> Poly1305DigestBench.updateBytes     1048576              thrpt    3       310.564 ±      31.053  ops/s
>
> Apple M1:
>
> Benchmark                        (dataSize)  (provider)   Mode  Cnt         Score         Error  Units
> Poly1305DigestBench.updateBytes          64              thrpt    3  33551968.946 ±  849843.905  ops/s
> Poly1305DigestBench.updateBytes         256              thrpt    3   9911637.214 ±   63417.224  ops/s
> Poly1305DigestBench.updateBytes        1024              thrpt    3   2604370.740 ±   29208.265  ops/s
> Poly1305DigestBench.updateBytes       16384              thrpt    3    165183.633 ±    1975.998  ops/s
> Poly1305DigestBench.updateBytes     1048576              thrpt    3      2587.132 ±      40.240  ops/s
>
> Benchmark                        (dataSize)  (provider)   Mode  Cnt         Score         Error  Units
> Poly1305DigestBench.updateBytes          64              thrpt    3  12373649.589 ±  184757.721  ops/s
> Poly1305DigestBench.updateBytes         256              th...

Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:

  Review comments

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/14085/files
  - new: https://git.openjdk.org/jdk/pull/14085/files/9cc899b9..93a03c62

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=14085&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14085&range=02-03

  Stats: 26 lines in 1 file changed: 19 ins; 0 del; 7 mod
  Patch: https://git.openjdk.org/jdk/pull/14085.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/14085/head:pull/14085

PR: https://git.openjdk.org/jdk/pull/14085