On Sun, 1 Mar 2026 18:17:09 GMT, Shawn Emery <[email protected]> wrote:

>> This implementation changes the limb size of X25519 from 10 x 26 bits to 5 x 
>> 51 bits in order to take advantage of performance gains from a reduction in 
>> the number of limb operations.
>> 
>> Performance gains were observed from the key generation/agreement, 
>> encapsulation, and decapsulation benchmarks for  both aarch64 with 3 - 8% 
>> gains and x86_64 with 9% gains.
>> 
>> Thank you @ferakocz for their help in working through the early stages of 
>> this code with me.
>
> Shawn Emery has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   Remove IntrinsicCandidate for mult() method

It would be nice if you could add comments to the methods about their 
requirements for the inputs and what they produce as a result (even in the 
ancestor classes).

src/java.base/share/classes/sun/security/util/math/intpoly/IntegerPolynomial25519.java
 line 39:

> 37:     private static final long LIMB_MASK = -1L >>> (64 - BITS_PER_LIMB);
> 38: 
> 39:     private static final long[] one = new long[] {

If the method get1() is not overridden, this is unnecessary.

src/java.base/share/classes/sun/security/util/math/intpoly/IntegerPolynomial25519.java
 line 42:

> 40:         0x0000000000000001L, 0x0000000000000000L, 0x0000000000000000L,
> 41:         0x0000000000000000L, 0x0000000000000000L };
> 42:     private static final long[] zero = new long[] {

If the method get0() is not overridden, this is unnecessary.

src/java.base/share/classes/sun/security/util/math/intpoly/IntegerPolynomial25519.java
 line 60:

> 58: 
> 59:     @Override
> 60:     public ImmutableElement get0() {

Is it necessary to override this method?

src/java.base/share/classes/sun/security/util/math/intpoly/IntegerPolynomial25519.java
 line 65:

> 63: 
> 64:     @Override
> 65:     public ImmutableElement get1() {

Is it necessary to override this method?

src/java.base/share/classes/sun/security/util/math/intpoly/IntegerPolynomial25519.java
 line 83:

> 81:     // Overriden for performance (unroll and unnesting)
> 82:     @Override
> 83:     protected void carry(long[] limbs) {

Have you tried working with all positive limbs (as in 
IntegerPolynomial1305.java)?

src/java.base/share/classes/sun/security/util/math/intpoly/IntegerPolynomial25519.java
 line 327:

> 325: 
> 326:     @Override
> 327:     protected void square(long[] a, long[] r) {

I suggest changing the square() method to a more efficient one by computing the 
a_i * a_j (i != j) products only once and addig the result twice to the 
appropriate places in the result. Something like this (not tested):


    @Override
    protected void square(long[] a, long[] r) {
        long aa0 = a[0];
        long aa1 = a[1];
        long aa2 = a[2];
        long aa3 = a[3];
        long aa4 = a[4];

        final long shift1 = 64 - BITS_PER_LIMB;
        final long shift2 = BITS_PER_LIMB;

        long d0, d1, d2, d3, d4;      // low digits from multiplication
        long dd0, dd1, dd2, dd3, dd4; // high digits from multiplication
        // multiplication result digits for each column
        long c0, c1, c2, c3, c4, c5, c6, c7, c8, c9;

        // Row 0 - multiply by aa0
        d0 = aa0 * aa0;
        dd0 = Math.multiplyHigh(aa0, aa0) << shift1 | (d0 >>> shift2);
        d0 &= LIMB_MASK;

        d1 = aa0 * aa1;
        dd1 = Math.multiplyHigh(aa0, bb1) << shift1 | (d1 >>> shift2);
        d1 &= LIMB_MASK;

        d2 = aa0 * aa2;
        dd2 = Math.multiplyHigh(aa0, bb2) << shift1 | (d2 >>> shift2);
        d2 &= LIMB_MASK;

        d3 = aa0 * aa3;
        dd3 = Math.multiplyHigh(aa0, bb3) << shift1 | (d3 >>> shift2);
        d3 &= LIMB_MASK;

        d4 = aa0 * aa4;
        dd4 = Math.multiplyHigh(aa0, bb4) << shift1 | (d4 >>> shift2);
        d4 &= LIMB_MASK;

        c0 = d0;
        c1 = (d1 << 1) + dd0;
        c2 = (d2 + dd1) << 1;
        c3 = (d3 + dd2) << 1;
        c4 = (d4 + dd3) << 1;
        c5 = dd4 << 1;

        // Row 1 - multiply by aa1
        d1 = aa1 * aa1;
        dd1 = Math.multiplyHigh(aa1, bb1) << shift1 | (d1 >>> shift2);
        d1 &= LIMB_MASK;

        d2 = aa1 * aa2;
        dd2 = Math.multiplyHigh(aa1, bb2) << shift1 | (d2 >>> shift2);
        d2 &= LIMB_MASK;

        d3 = aa1 * aa3;
        dd3 = Math.multiplyHigh(aa1, bb3) << shift1 | (d3 >>> shift2);
        d3 &= LIMB_MASK;

        d4 = aa1 * aa4;
        dd4 = Math.multiplyHigh(aa1, bb4) << shift1 | (d4 >>> shift2);
        d4 &= LIMB_MASK;

        c2 += d1;
        c3 += (d2 << 1) + dd1;
        c4 += (d3 + dd2) << 1;
        c5 += (d4 + dd3) << 1;
        c6 = dd4 << 1;

        // Row 2 - multiply by aa2
        d2 = aa2 * aa2;
        dd2 = Math.multiplyHigh(aa2, bb2) << shift1 | (d2 >>> shift2);
        d2 &= LIMB_MASK;

        d3 = aa2 * aa3;
        dd3 = Math.multiplyHigh(aa2, bb3) << shift1 | (d3 >>> shift2);
        d3 &= LIMB_MASK;

        d4 = aa2 * aa4;
        dd4 = Math.multiplyHigh(aa2, bb4) << shift1 | (d4 >>> shift2);
        d4 &= LIMB_MASK;

        c4 += d2;
        c5 += (d3 << 1) + dd2;
        c6 += (d4 + dd3) << 1;
        c7 = dd4 << 1;

        // Row 3 - multiply by aa3
        d3 = aa3 * aa3;
        dd3 = Math.multiplyHigh(aa3, bb3) << shift1 | (d3 >>> shift2);
        d3 &= LIMB_MASK;

        d4 = aa3 * aa4;
        dd4 = Math.multiplyHigh(aa3, bb4) << shift1 | (d4 >>> shift2);
        d4 &= LIMB_MASK;

        c6 += d3;
        c7 += (d4 << 1) + dd3;
        c8 = dd4 << 1;

        // Row 4 - multiply by aa4
        d4 = aa4 * aa4;
        dd4 = Math.multiplyHigh(aa4, bb4) << shift1 | (d4 >>> shift2);
        d4 &= LIMB_MASK;

        c8 += d4;
        c9 = dd4;

        // Perform pseudo-Mersenne reduction
        r[0] = c0 + (19 * c5);
        r[1] = c1 + (19 * c6);
        r[2] = c2 + (19 * c7);
        r[3] = c3 + (19 * c8);
        r[4] = c4 + (19 * c9);

        reduce(r);
    }

-------------

PR Comment: https://git.openjdk.org/jdk/pull/29981#issuecomment-3984615568
PR Review Comment: https://git.openjdk.org/jdk/pull/29981#discussion_r2872567722
PR Review Comment: https://git.openjdk.org/jdk/pull/29981#discussion_r2872568897
PR Review Comment: https://git.openjdk.org/jdk/pull/29981#discussion_r2872390257
PR Review Comment: https://git.openjdk.org/jdk/pull/29981#discussion_r2872391902
PR Review Comment: https://git.openjdk.org/jdk/pull/29981#discussion_r2872333415
PR Review Comment: https://git.openjdk.org/jdk/pull/29981#discussion_r2872319370

Reply via email to