On Tue, 16 Apr 2024 00:04:15 GMT, Scott Gibbons <sgibb...@openjdk.org> wrote:
>> This code makes an intrinsic stub for `Unsafe::setMemory` for x86_64. See >> [this PR](https://github.com/openjdk/jdk/pull/16760) for discussion around >> this change. >> >> Overall, making this an intrinsic improves overall performance of >> `Unsafe::setMemory` by up to 4x for all buffer sizes. >> >> Tested with tier-1 (and full CI). I've added a table of the before and >> after numbers for the JMH I ran (`MemorySegmentZeroUnsafe`). >> >> [setMemoryBM.txt](https://github.com/openjdk/jdk/files/14808974/setMemoryBM.txt) > > Scott Gibbons has updated the pull request incrementally with one additional > commit since the last revision: > > Add enter() and leave(); remove Windows-specific register stuff src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 4013: > 4011: // Initialize table for unsafe copy memeory check. > 4012: if (UnsafeMemoryAccess::_table == nullptr) { > 4013: UnsafeMemoryAccess::create_table(26); How did you arrive at a table size of 26? src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2603: > 2601: const Register wide_value = rax; > 2602: const Register rScratch1 = r10; > 2603: Maybe put an `assert_different_registers` here for the above registers, just to be sure. (I see you are avoiding the existing `rscratch1` already, because of a conflict with `c_rarg2`) src/hotspot/cpu/x86/stubGenerator_x86_64_arraycopy.cpp line 2674: > 2672: // Parameter order is (ptr, byteVal, size) > 2673: __ xchgq(c_rarg1, c_rarg2); > 2674: __ pop(rbp); // Clear effect of enter() Why not just use `leave()` here? src/hotspot/share/opto/library_call.cpp line 4959: > 4957: if (stopped()) return true; > 4958: > 4959: if (StubRoutines::unsafe_setmemory() == nullptr) return false; I don't see why this check is needed here, since we already check whether the stub is there in `is_intrinsic_supported`. Note that `inline_unsafe_copyMemory` also doesn't have this check. I think it would be good to keep consistency between the two. src/hotspot/share/opto/runtime.cpp line 780: > 778: const TypeFunc* OptoRuntime::make_setmemory_Type() { > 779: // create input type (domain) > 780: int num_args = 4; This variable seems redundant. src/hotspot/share/opto/runtime.cpp line 786: > 784: fields[argp++] = TypePtr::NOTNULL; // dest > 785: fields[argp++] = TypeLong::LONG; // size > 786: fields[argp++] = Type::HALF; // size Since the size is a `size_t`, I don't think this is correct on 32-bit platforms. I think here we want `TypeX_X`, and then add the extra `HALF` only on 64-bit platforms. Similar to what we do in `make_arraycopy_Type`: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/runtime.cpp#L799-L802 (Note that you will also have to adjust `argcnt` for this) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572570842 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572578437 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572593795 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572556648 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572564382 PR Review Comment: https://git.openjdk.org/jdk/pull/18555#discussion_r1572562058