On Thu, 14 Nov 2024 23:24:33 GMT, Ben Perez <bpe...@openjdk.org> wrote:
>> Java implementation of ML-DSA, the FIPS 204 post-quantum signature scheme >> https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.204.pdf. Depends on >> https://github.com/openjdk/jdk/pull/21167 > > Ben Perez has updated the pull request incrementally with one additional > commit since the last revision: > > Updated copyrite, changed classes in ML_DSA_Impls to sealed Just a couple nit comments, not crucial to the initial integration. src/java.base/share/classes/sun/security/provider/ML_DSA.java line 1029: > 1027: int tmp = montMul(MONT_ZETAS_FOR_NTT[m], coeffs[j + > l]); > 1028: coeffs[j + l] = coeffs[j] - tmp; > 1029: coeffs[j] = coeffs[j] + tmp; Similar to how you land the array contents onto a couple of ints in `power2round`, I wonder if you might squeeze a few clock cycles out by putting `coeffs[j]` and maybe `coeffs[j + 1]` onto local int variables to increase register pressure. You would save potentially a couple reaches into memory if that int lived on a register. Not sure how much mileage you would get from it. src/java.base/share/classes/sun/security/provider/ML_DSA.java line 1050: > 1048: int tmp = coeffs[j]; > 1049: coeffs[j] = (tmp + coeffs[j + l]); > 1050: coeffs[j + l] = montMul(tmp - coeffs[j + l], > MONT_ZETAS_FOR_INVERSE_NTT[m]); This might be another area where you can land `coeffs[j + 1]` on a local int. Again, not sure if or how much it might help. ------------- Marked as reviewed by jnimeh (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/21364#pullrequestreview-2447311809 PR Review Comment: https://git.openjdk.org/jdk/pull/21364#discussion_r1849474054 PR Review Comment: https://git.openjdk.org/jdk/pull/21364#discussion_r1849475269