On Wed, 5 Mar 2025 13:07:54 GMT, Ferenc Rakoczi <d...@openjdk.org> wrote:
>> src/hotspot/cpu/x86/stubGenerator_x86_64_dilithium.cpp line 292:
>>
>>> 290:     __ movl(iterations, 2);
>>> 291:
>>> 292:     __ BIND(L_loop);
>>
>> Hi @ferakocz , Kindly align loop entry address using __align64() here and at
>> all the places before __BIND(LOOP)
>
> Hi, @jatin-bhateja, thanks for the suggestion. I have added
> __ align(OptoLoopAlignment); before all loop entries.

Hi @ferakocz ,

Thanks! For efficient utilization of the Decoded ICache (please refer to Intel SDM section 3.4.2.5), code blocks should be aligned to 32-byte boundaries. A 64-byte-aligned address is also 16- and 32-byte aligned, and 64 bytes matches the cache line size. However, I noticed that we already use OptoLoopAlignment in places in AES-GCM as well.

I deliberately introduced some errors into the generate_dilithiumAlmostInverseNtt_avx512 implementation, expecting the existing ML_DSA_Tests under test/jdk/sun/security/provider/acvp to catch them, but all the tests passed for me.

`java -jar /home/jatinbha/sandboxes/jtreg/build/images/jtreg/lib/jtreg.jar -jdk:$JAVA_HOME -Djdk.test.lib.artifacts.ACVP-Server=/home/jatinbha/softwares/v1.1.0.38.zip -va -timeout:4 Launcher.java`

Can you please point out a test I need to use for validation?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23860#discussion_r1981468903
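
To illustrate the alignment argument above, here is a minimal standalone sketch of the arithmetic. It is not HotSpot code: `is_aligned` and `padding_to_align` are hypothetical helper names and the addresses are made up. It only shows that a 64-byte-aligned address is necessarily 32- and 16-byte aligned, while the reverse does not hold.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not part of HotSpot.
// `alignment` is assumed to be a power of two (16, 32, 64, ...).
constexpr bool is_aligned(std::uint64_t addr, std::uint64_t alignment) {
    return (addr & (alignment - 1)) == 0;
}

// Number of padding bytes needed to move `addr` up to the next
// `alignment` boundary (0 if it is already aligned).
constexpr std::uint64_t padding_to_align(std::uint64_t addr,
                                         std::uint64_t alignment) {
    return (alignment - (addr & (alignment - 1))) & (alignment - 1);
}

// A made-up loop entry at 0x1234 needs 12 bytes of padding to reach 0x1240.
static_assert(padding_to_align(0x1234, 64) == 12, "pad to 64-byte boundary");

// 64-byte alignment implies 32- and 16-byte alignment...
static_assert(is_aligned(0x1240, 64) && is_aligned(0x1240, 32) &&
              is_aligned(0x1240, 16), "64 implies 32 and 16");

// ...but a 32-byte-aligned address need not be 64-byte aligned.
static_assert(is_aligned(0x1220, 32) && !is_aligned(0x1220, 64),
              "32 does not imply 64");
```

The trade-off is padding size: aligning to 64 bytes can insert up to 63 bytes before a loop entry, versus up to 31 bytes for 32-byte alignment.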