On Wed, 5 Mar 2025 14:03:00 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:

>> Hi, @jatin-bhateja, thanks for the suggestion. I have added
>> __ align(OptoLoopAlignment); before all loop entries.
>
> Hi @ferakocz,
> 
> Thanks! For efficient utilization of the Decode ICache (please refer to Intel 
> SDM section 3.4.2.5), code blocks should be aligned to 32-byte boundaries; a 
> 64-byte aligned address is also 16- and 32-byte aligned, and it matches the 
> cache line size. However, I noticed that we have been using 
> OptoLoopAlignment in places in AES-GCM as well.
> 
> I introduced some errors into the generate_dilithiumAlmostInverseNtt_avx512 
> implementation, expecting the existing ML_DSA_Tests under 
> test/jdk/sun/security/provider/acvp to catch them.
> 
> But all the tests passed for me.
> `java  -jar /home/jatinbha/sandboxes/jtreg/build/images/jtreg/lib/jtreg.jar 
> -jdk:$JAVA_HOME 
> -Djdk.test.lib.artifacts.ACVP-Server=/home/jatinbha/softwares/v1.1.0.38.zip 
> -va -timeout:4 Launcher.java`
> 
> Can you please point out a test I need to use for validation?

I think the easiest way is to put a for (int i = 0; i < 1000; i++) loop around 
the switch statement in the run() method of the ML_DSA_Test class 
(test/jdk/sun/security/provider/acvp/ML_DSA_Test.java). (This is because the 
intrinsics only kick in after the method has been called a few thousand times.)
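
A minimal sketch of the idea, not the actual contents of ML_DSA_Test.java: the class name WarmupSketch, the dispatch() helper, and the case labels below are hypothetical placeholders that stand in for the real switch statement in run(). It only illustrates wrapping the dispatch in a repeat loop so that the code under test is called often enough to get compiled.

```java
public class WarmupSketch {
    // Hypothetical stand-in for the switch statement inside run();
    // the real test dispatches to its own test routines.
    static int dispatch(String kind) {
        switch (kind) {
            case "keyGen": return 1;
            case "sigGen": return 2;
            case "sigVer": return 3;
            default: throw new IllegalArgumentException("unknown test kind: " + kind);
        }
    }

    // Repeat the dispatch so the methods under test are invoked many times.
    // Returns the number of dispatch calls that were made.
    static int runRepeated(String kind, int iterations) {
        int calls = 0;
        for (int i = 0; i < iterations; i++) {
            dispatch(kind);
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        // 1000 is the iteration count suggested above.
        System.out.println(runRepeated("sigGen", 1000));
    }
}
```

In the real test, the loop body would be the existing switch statement, unchanged; only the enclosing for loop is new.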

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23860#discussion_r1981945490
