Re: RFR: 8355216: Accelerate P-256 arithmetic on aarch64 [v11]

Ferenc Rakoczi Thu, 18 Jun 2026 09:18:56 -0700

> This PR adds an aarch64 implementation of the 
> MontgomeryIntegerPolynomial256.mult() method and 
> IntegerPolynomial.conditionalAssign(). Because 64-bit multiplication is not 
> supported on Neon, and performing it manually with 32-bit limbs is slower 
> than using GPRs, a hybrid Neon/GPR approach is used: Neon instructions 
> compute intermediate values needed by the last two iterations of the main 
> "loop", while the GPRs compute the first few iterations. This improves 
> performance by ~9% at the method level and by roughly 5% at the API level.
> 
> 
> 
> ---------
> - [x] I confirm that I make this contribution in accordance with the [OpenJDK 
> Interim AI Policy](https://openjdk.org/legal/ai).
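
For readers unfamiliar with the second routine named above, here is a minimal Java sketch of a branch-free conditional assignment over limb arrays. It is an illustration only, not the JDK source and not the aarch64 intrinsic: the class name `ConditionalAssignSketch` is hypothetical, and the sketch assumes `set` is either 0 or 1 and that `a` and `b` have the same length.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the JDK.
final class ConditionalAssignSketch {

    // Copies b into a when set == 1 and leaves a unchanged when set == 0,
    // without branching on set. Assumes set is 0 or 1 and a.length == b.length.
    static void conditionalAssign(int set, long[] a, long[] b) {
        long mask = -(long) set;          // 0 -> all zero bits, 1 -> all one bits
        for (int i = 0; i < a.length; i++) {
            // a ^ (a ^ b) == b when the mask is all ones; a ^ 0 == a otherwise.
            a[i] ^= mask & (a[i] ^ b[i]);
        }
    }

    public static void main(String[] args) {
        long[] a = {1L, 2L, 3L, 4L, 5L};
        long[] b = {9L, 8L, 7L, 6L, 5L};

        conditionalAssign(0, a, b);
        System.out.println(Arrays.toString(a)); // [1, 2, 3, 4, 5]

        conditionalAssign(1, a, b);
        System.out.println(Arrays.toString(a)); // [9, 8, 7, 6, 5]
    }
}
```

The masking form performs the same memory accesses and arithmetic for both values of `set`, which is the usual reason to prefer it over an `if` in elliptic-curve code.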


Ferenc Rakoczi has updated the pull request incrementally with one additional 
commit since the last revision:

  Unite x86 and aarch64 for UseIntPolyIntrinsics for AOTCache.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/30941/files
  - new: https://git.openjdk.org/jdk/pull/30941/files/2c244066..5b495b00

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=30941&range=10
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=30941&range=09-10

  Stats: 3 lines in 1 file changed: 1 ins; 2 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/30941.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/30941/head:pull/30941

PR: https://git.openjdk.org/jdk/pull/30941
