On Wed, 15 Nov 2023 00:39:29 GMT, Sandhya Viswanathan <sviswanat...@openjdk.org> wrote:
>> src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 1186:
>>
>>> 1184:     __ evmovntdquq(Address(dst, index, scale, offset + 0x40), xmm2,
>>> Assembler::AVX_512bit);
>>> 1185:     __ evmovntdquq(Address(dst, index, scale, offset + 0x80), xmm3,
>>> Assembler::AVX_512bit);
>>> 1186:     __ evmovntdquq(Address(dst, index, scale, offset + 0xC0), xmm4,
>>> Assembler::AVX_512bit);
>>
>> These are non-temporal memory moves, to force eviction from write combining
>> buffers we may need to emit additional fences, else a subsequent read from
>> destination memory may see incorrect values.
>
> @jatin-bhateja There is a sfence at line 781.

Thanks, there is a store fence upon completion of the main loop for the large size code:

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16575#discussion_r1393511087