On Thu, 3 Apr 2025 15:27:15 GMT, Robert Toyonaga <[email protected]> wrote:
>> ### Update:
>> After some discussion it was decided it's not necessary to expand the lock
>> scope for reserve/commit. Instead, we are opting to add comments explaining
>> the reasons for locking and the conditions to avoid which could lead to
>> races. Some of the new tests can be kept because they are general enough to
>> be useful outside of this context.
>>
>> ### Summary:
>> This PR makes memory operations atomic with NMT accounting.
>>
>> ### The problem:
>> In memory related functions like `os::commit_memory` and
>> `os::reserve_memory` the OS memory operations are currently done before
>> acquiring the the NMT mutex. And the the virtual memory accounting is done
>> later in `MemTracker`, after the lock has been acquired. Doing the memory
>> operations outside of the lock scope can lead to races.
>>
>> 1.1 Thread_1 releases range_A.
>> 1.2 Thread_1 tells NMT "range_A has been released".
>>
>> 2.1 Thread_2 reserves (the now free) range_A.
>> 2.2 Thread_2 tells NMT "range_A is reserved".
>>
>> Since the sequence (1.1) (1.2) is not atomic, if Thread_2 begins operating
>> after (1.1), we can have (1.1) (2.1) (2.2) (1.2). The OS sees two valid
>> subsequent calls (release range_A, followed by map range_A). But NMT sees
>> "reserve range_A", "release range_A" and is now out of sync with the OS.
>>
>> ### Solution:
>> Where memory operations such as reserve, commit, or release virtual memory
>> happen, I've expanded the scope of `NmtVirtualMemoryLocker` to protect both
>> the NMT accounting and the memory operation itself.
>>
>> ### Other notes:
>> I also simplified this pattern found in many places:
>>
>> if (MemTracker::enabled()) {
>> MemTracker::NmtVirtualMemoryLocker nvml;
>> result = pd_some_operation(addr, bytes);
>> if (result != nullptr) {
>> MemTracker::record_some_operation(addr, bytes);
>> }
>> } else {
>> result = pd_unmap_memory(addr, bytes);
>> }
>> ```
>> To:
>>
>> MemTracker::NmtVirtualMemoryLocker nvml;
>> result = pd_unmap_memory(addr, bytes);
>> MemTracker::record_some_operation(addr, bytes);
>> ```
>> This is possible because `NmtVirtualMemoryLocker` now checks
>> `MemTracker::enabled()`. `MemTracker::record_some_operation` already checks
>> `MemTracker::enabled()` and checks against nullptr. This refactoring
>> previously wasn't possible because `ThreadCritical` was used before
>> https://github.com/openjdk/jdk/pull/22745 introduced
>> `NmtVirtualMemoryLocker`.
>>
>> I considered moving the locking and NMT accounting down into platform
>> specific code: Ex. lock around { munmap() + MemTracker:...
>
> Robert Toyonaga has updated the pull request incrementally with one
> additional commit since the last revision:
>
> exclude file mapping tests on AIX.
I think this looks good to me, but please seek feedback from others as well.
I've added a couple of suggestions. None of them are required, but I think they
would be nice to do.
src/hotspot/share/runtime/os.cpp line 2206:
> 2204: // when it is actually committed. The opposite scenario is not guarded
> against. pd_commit_memory and
> 2205: // record_virtual_memory_commit do not happen atomically. We assume
> that there is some external synchronization
> 2206: // that prevents a region from being uncommitted before it is finished
> being committed.
It's not a requirement, but you get kudos from me if you keep comments lines
below 80 lines. I typically don't like code to be 80 lines, but comments tend
to be nicer if they are.
test/hotspot/gtest/runtime/test_os.cpp line 1123:
> 1121:
> 1122: char* base = os::reserve_memory(size, false, mtTest);
> 1123: ASSERT_NE(base, (char*) nullptr);
Suggestion:
ASSERT_NOT_NULL(base);
And the same in other places.
test/hotspot/gtest/runtime/test_os.cpp line 1133:
> 1131: }
> 1132:
> 1133: #if !defined(_AIX)
Suggestion:
#if !defined(_AIX)
I suggest a blank line here because this ifdef spans multiple tests and not
only the nearest test. Having a blank line makes it clearer that this is a
large ifdef that is not only related to the test case that it is bunched up
against.
test/hotspot/gtest/runtime/test_os.cpp line 1145:
> 1143: EXPECT_TRUE(result != nullptr);
> 1144:
> 1145: EXPECT_TRUE(strcmp(letters, result)==0);
Suggestion:
EXPECT_TRUE(strcmp(letters, result) == 0);
but probably even better:
Suggestion:
EXPECT_EQ(strcmp(letters, result), 0);
test/hotspot/gtest/runtime/test_os.cpp line 1184:
> 1182: ::close(fd);
> 1183: }
> 1184: #endif
Suggestion:
#endif // !defined(_AIX)
I suggest a blank line and a matching comment. I know some HotSpots devs tend
to appreciate those comments.
-------------
Marked as reviewed by stefank (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/24084#pullrequestreview-2750137709
PR Review Comment: https://git.openjdk.org/jdk/pull/24084#discussion_r2033287481
PR Review Comment: https://git.openjdk.org/jdk/pull/24084#discussion_r2033292443
PR Review Comment: https://git.openjdk.org/jdk/pull/24084#discussion_r2033303666
PR Review Comment: https://git.openjdk.org/jdk/pull/24084#discussion_r2033294266
PR Review Comment: https://git.openjdk.org/jdk/pull/24084#discussion_r2033307030