Re: RFR: 8355216: Accelerate P-256 arithmetic on aarch64 [v5]

Andrew Haley Tue, 19 May 2026 01:32:02 -0700

On Mon, 18 May 2026 14:59:38 GMT, Andrew Dinn <[email protected]> wrote:


>> Ferenc Rakoczi has updated the pull request incrementally with one 
>> additional commit since the last revision:
>> 
>>   Accepting more suggestions from Andrew Dinn.
>
> src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 7758:
> 
>> 7756:       __ lsr(tmp, lo, montMulP256Shift2);
>> 7757:       __ orr(hi, hi, tmp);
>> 7758:       __ andr(lo, lo, mask);
> 
> Suggestion:
> 
>       // compute 104-bit (40 + 64) full product
>       __ umulh(hi, a, b);
>       __ mul(lo, a, b);
>       // combine 40 + 12 bits into hi result
>       __ lsl(hi, hi, montMulP256Shift1);
>       __ lsr(tmp, lo, montMulP256Shift2);
>       __ orr(hi, hi, tmp);
>       // mask off 52 bits of lo result
>       __ andr(lo, lo, mask);

It might be better and clearer to use `bfm` rather than shifting, masking, and 
ORing.
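
To make the comparison concrete, here is a rough C++ model of the arithmetic, not the stub code itself. It assumes `montMulP256Shift1` is 12 and `montMulP256Shift2` is 52 (inferred from the "40 + 12" and "52 bits" comments in the suggestion), and it uses a hypothetical `bitfield_insert` helper to stand in for a BFI-style alias of `bfm`. It also relies on the GCC/Clang `unsigned __int128` extension. Which `bfm` form would actually fit best in the stub is left open.

```cpp
#include <cassert>
#include <cstdint>

// Assumed values, inferred from the comments in the suggestion above.
constexpr unsigned kShift1 = 12;  // stands in for montMulP256Shift1
constexpr unsigned kShift2 = 52;  // stands in for montMulP256Shift2
constexpr uint64_t kMask52 = (uint64_t{1} << 52) - 1;

struct Limbs {
  uint64_t hi;  // bits 52..103 of the product
  uint64_t lo;  // bits 0..51 of the product
};

// Mirrors the umulh / mul / lsl / lsr / orr / andr sequence.
inline Limbs split_shift_or(uint64_t a, uint64_t b) {
  unsigned __int128 p = static_cast<unsigned __int128>(a) * b;
  uint64_t hi = static_cast<uint64_t>(p >> 64);  // umulh
  uint64_t lo = static_cast<uint64_t>(p);        // mul
  hi <<= kShift1;                                // lsl
  uint64_t tmp = lo >> kShift2;                  // lsr
  hi |= tmp;                                     // orr
  lo &= kMask52;                                 // andr
  return {hi, lo};
}

// Hypothetical model of a bitfield insert: copy the low `width` bits of
// `src` into `dst` starting at bit `lsb`, leaving the other bits alone.
inline uint64_t bitfield_insert(uint64_t dst, uint64_t src,
                                unsigned lsb, unsigned width) {
  uint64_t field = (width == 64) ? ~uint64_t{0}
                                 : ((uint64_t{1} << width) - 1);
  return (dst & ~(field << lsb)) | ((src & field) << lsb);
}

// Same split, with the lsl + orr pair replaced by one bitfield insert.
inline Limbs split_bitfield(uint64_t a, uint64_t b) {
  unsigned __int128 p = static_cast<unsigned __int128>(a) * b;
  uint64_t hi = static_cast<uint64_t>(p >> 64);
  uint64_t lo = static_cast<uint64_t>(p);
  uint64_t out = lo >> kShift2;  // top 12 bits of lo land in bits 0..11
  out = bitfield_insert(out, hi, kShift1, 64 - kShift1);
  return {out, lo & kMask52};
}
```

Under those assumptions both functions return the same limbs for any inputs, so the change would be a matter of instruction count and readability rather than behaviour.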

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/30941#discussion_r3264769223
