On Mon, 17 Jun 2024 21:21:01 GMT, Vladimir Kozlov <k...@openjdk.org> wrote:

> Let me know that I got it right:
> 
> * The reduction operation was optional and P256 benefitted by not executing 
> it.
> * Previous `mult()` **Java** code always retuned 0 because it executes 
> reduction so callers do not need to do it.
> * `_intpoly_montgomeryMult_P256` intrinsic code executes only part of code 
> from previous `mult()` and it returns 1 to indicate that reduction should be 
> executed if needed.
> * Now `mult()` is split into 2 methods (with `multImpl()` intrinisfied) and 
> always executes reduction so it can return 0.

Thats it exactly. Except I would correct the last two words `return 0`. It is 
now void so no return (and I imagine that is why XDH did not like it; having it 
hardcoded to 0, without having to do inlining, opens the doors for some more 
optimizations. Also, the code I had checked in as part of montgomery PR was 
returning 0 everywhere but the intrinsic.

> I like new implementation because intrinsic matches Java code. It would allow 
> avoid confusion I had.

I disliked this too. I originally removed the Java reduction too, but it hurt 
the non-intrinsic performance, so put it back in. (Before I got distracted with 
this bug, I was actually working on next ECC iteration, and was trying to fix 
this mismatch. But I also hadn't realized how much this optimization actually 
helped.)

There is also a 'bigger' complaint.. this optimization tried to use virtual 
methods to specialize one particular curve. Fairly standard practice. And it 
brought the other 'unaffected' curve down. If I can't use virtual methods for 
further optimizations.. how am I supposed to optimize further? Hmm. Not the 
time to discuss an answer, this release is going out, not the time to get 
'creative', but this will give me problems next time I try to add code here.
 
> The only question left: do we need to do something about Java code which 
> checks return value? It is always 0 now. And I don't see you changed such 
> checks.

(Correction: no return, void). numAdds is now again pretty much a 'private' 
concept to the IntegerPolynomial class, so figure it was fine before, it should 
be fine now?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/19728#issuecomment-2174493638

Reply via email to