On Mon, 22 Jan 2024 07:08:31 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:
>> src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 505:
>>
>>> 503:     __ cmpb(Address(rbx, r15, Address::times_1, -0xa), rax);
>>> 504:     __ jne(L_top_loop_1);
>>> 505:     __ jmp(L_0x406019);
>>
>> Instead of having special handling for each tail size (3 - 31 bytes), can we
>> directly use 32 bytes VMASKMOVPS with appropriate mask for different tail
>> sizes and only residual part (0 - 3 bytes) can fall over to scalar tail.
>
> Basically tail size can be rounded to nearest multiple of doubleword.

I have since changed the algorithm due to a request from @sviswa7.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1610120366
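
For readers following the quoted suggestion, here is a minimal sketch of the arithmetic it implies: round the tail down to whole doublewords, enable that many 32-bit lanes in the VMASKMOVPS mask, and leave the remaining 0 - 3 bytes to scalar code. This is not code from the PR. `TailSplit` and `split_tail` are hypothetical names, and the sketch assumes the tail is at most 31 bytes and only models the mask in portable C++ rather than emitting the instruction.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of the PR: describes how a tail of
// 0..31 bytes could be split between one masked 32-byte access and
// a scalar remainder.
struct TailSplit {
    std::array<std::int32_t, 8> lane_mask;  // one entry per 32-bit lane
    std::size_t masked_bytes;               // bytes covered by the masked access
    std::size_t scalar_bytes;               // 0..3 bytes left for scalar code
};

TailSplit split_tail(std::size_t tail_bytes) {
    assert(tail_bytes < 32);  // assumption: the tail is shorter than one YMM register

    TailSplit s{};
    const std::size_t dwords = tail_bytes / 4;  // round down to whole doublewords

    for (std::size_t i = 0; i < s.lane_mask.size(); ++i) {
        // VMASKMOVPS looks only at the most significant bit of each
        // 32-bit mask element; -1 sets that bit, 0 clears it.
        s.lane_mask[i] = (i < dwords) ? -1 : 0;
    }

    s.masked_bytes = dwords * 4;
    s.scalar_bytes = tail_bytes - s.masked_bytes;
    return s;
}
```

For a 13-byte tail, `split_tail(13)` enables the first three lanes (12 bytes) and leaves 1 byte for the scalar path. A real stub would more likely load the mask from a precomputed table indexed by the doubleword count than build it in a loop.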