On Wed, 14 May 2025 11:41:30 GMT, Ferenc Rakoczi <d...@openjdk.org> wrote:
>> src/hotspot/cpu/x86/stubGenerator_x86_64_kyber.cpp line 696:
>>
>>> 694: address generate_kyberAddPoly_2_avx512(StubGenerator *stubgen,
>>> 695: MacroAssembler *_masm) {
>>> 696:
>>
>> The Java code for "implKyberAddPoly(short[] result, short[] a, short[] b)"
>> does BarrettReduction but the intrinsic code here does not. Is that
>> intentional and how is the reduction handled?
>
> Actually, the Java version is the one that is too cautious. There is Barrett
> reduction after at most 4 consecutive uses of mlKemAddPoly(), so doing the
> reduction in implKyberAddPoly() is not necessary. Thanks for discovering this!

Thanks. I have another question: is there a reason that the Java versions of
AddPoly (for both 2 and 3 inputs) return 1, whereas the corresponding
intrinsics return 0?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24953#discussion_r2089278218
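
To illustrate the overflow argument quoted above, here is a minimal Java sketch. It is not the JDK's ML_KEM code or the AVX-512 stub. The class and method names are hypothetical, the constants (q = 3329, multiplier 20159, shift 26) follow the public Kyber reference Barrett reduction, and the sketch assumes that every operand entering a chain of additions has already been Barrett-reduced to a centered value.

```java
// Hypothetical sketch; names are illustrative and not taken from the JDK sources.
final class BarrettBoundSketch {
    static final int Q = 3329;            // ML-KEM (Kyber) modulus
    static final int MULTIPLIER = 20159;  // round(2^26 / Q), as in the Kyber reference code
    static final int SHIFT = 26;

    // Centered Barrett reduction: returns r with r == a (mod Q), small in magnitude.
    static short barrettReduce(short a) {
        int t = (MULTIPLIER * a + (1 << (SHIFT - 1))) >> SHIFT;
        return (short) (a - t * Q);
    }

    // Largest magnitude observed after reducing every possible 16-bit input.
    static int maxReducedMagnitude() {
        int max = 0;
        for (int a = Short.MIN_VALUE; a <= Short.MAX_VALUE; a++) {
            max = Math.max(max, Math.abs(barrettReduce((short) a)));
        }
        return max;
    }

    // Worst-case coefficient magnitude after `adds` consecutive additions with no
    // reduction in between, ASSUMING every operand was reduced beforehand.
    // Each addition of `inputsPerAdd` polynomials brings in (inputsPerAdd - 1) new operands.
    static int worstCaseAfterAdds(int adds, int inputsPerAdd) {
        int operands = 1 + adds * (inputsPerAdd - 1);
        return operands * maxReducedMagnitude();
    }

    public static void main(String[] args) {
        System.out.println("max reduced magnitude: " + maxReducedMagnitude());
        System.out.println("4 two-input adds:      " + worstCaseAfterAdds(4, 2));
        System.out.println("4 three-input adds:    " + worstCaseAfterAdds(4, 3));
        System.out.println("short limit:           " + Short.MAX_VALUE);
    }
}
```

Under those assumptions, even four consecutive three-input additions stay well inside the range of a `short`, which is consistent with the claim that the per-call reduction in the Java version is not needed. Whether the actual call sites always feed reduced operands into the additions is something only the real code can confirm.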