On Wed, 17 Jun 2026 00:02:25 GMT, Shawn Emery <[email protected]> wrote:
>> Curve25519 polynomial arithmetic is performed with intrinsics implemented
>> in GPR related instructions for multiplication operations (method mult()).
>> Benchmark improvements include:
>>
>> X25519 decapsulation: +9%
>> X25519 encapsulation: +9%
>> X25519 key agreement: +7%
>> X25519 key-pair generation: +10%
>> X25519-MLKEM decapsulation: +7%
>> X25519-MLKEM encapsulation: +8%
>> X25519-MLKEM key-pair generation: +8%
>> EdDSA sign: +12%
>> EdDSA verify: +12%
>> EdDSA key-pair generation: +15%
>>
>> Note 1: The differences between the AArch64 and x86_64 intrinsics
>> implementations include the lack of square() intrinsics; their usage caused
>> a 3.3% performance regression due to the efficiencies of the symmetric
>> squaring shape in Java vs. the inefficiencies of the leaf calls and the
>> additional cycles required for 64 bit multiplication in AArch64.
>> Note 2: The GPR related instructions were optimal when compared to a hybrid
>> (GPR related instructions for the first two iterations and Neon instructions
>> for the last two iterations) solution. The hybrid design produced a 4%/1%
>> performance drop in KEM decapsulation/encapsulation compared to the GPR
>> related instructions, as the efficiencies of SIMD parallelism did not
>> compensate enough for the overhead of performing the limb splits and
>> reconstruction.
>>
>> ---------
>> - [X] I confirm that I make this contribution in accordance with the
>> [OpenJDK Interim AI Policy](https://openjdk.org/legal/ai).
>
> Shawn Emery has updated the pull request incrementally with one additional
> commit since the last revision:
>
>   Update based on adinn's comments

Good

-------------

Marked as reviewed by adinn (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/31409#pullrequestreview-4513554114
