On Thu, 3 Apr 2025 15:27:15 GMT, Robert Toyonaga <d...@openjdk.org> wrote:

>> ### Update:
>> After some discussion it was decided it's not necessary to expand the lock 
>> scope for reserve/commit. Instead, we are opting to add comments explaining 
>> the reasons for locking and the conditions to avoid which could lead to 
>> races.  Some of the new tests can be kept because they are general enough to 
>> be useful outside of this context.
>> 
>> ### Summary:
>> This PR makes memory operations atomic with NMT accounting.
>> 
>> ### The problem:
>> In memory related functions like `os::commit_memory` and 
>> `os::reserve_memory` the OS memory operations are currently done before 
>> acquiring the the NMT mutex. And the the virtual memory accounting is done 
>> later in `MemTracker`, after the lock has been acquired. Doing the memory 
>> operations outside of the lock scope can lead to races. 
>> 
>> 1.1 Thread_1 releases range_A.
>> 1.2 Thread_1 tells NMT "range_A has been released".
>> 
>> 2.1 Thread_2 reserves (the now free) range_A.
>> 2.2 Thread_2 tells NMT "range_A is reserved".
>> 
>> Since the sequence (1.1)  (1.2) is not atomic, if Thread_2 begins operating 
>> after (1.1), we can have (1.1) (2.1) (2.2) (1.2). The OS sees two valid 
>> subsequent calls (release range_A, followed by map range_A). But NMT sees 
>> "reserve range_A", "release range_A" and is now out of sync with the OS.
>> 
>> ### Solution:
>> Where memory operations such as reserve, commit, or release virtual memory 
>> happen,  I've expanded the scope of `NmtVirtualMemoryLocker` to protect both 
>> the NMT accounting and the memory operation itself. 
>> 
>> ### Other notes:
>> I also simplified this  pattern found in many places:
>> 
>> if (MemTracker::enabled()) {
>>   MemTracker::NmtVirtualMemoryLocker nvml;
>>   result = pd_some_operation(addr, bytes);
>>   if (result != nullptr) {
>>     MemTracker::record_some_operation(addr, bytes);
>>   }
>> } else {
>>   result = pd_unmap_memory(addr, bytes);
>> }
>> ``` 
>> To:
>> 
>> MemTracker::NmtVirtualMemoryLocker nvml;
>> result = pd_unmap_memory(addr, bytes);
>> MemTracker::record_some_operation(addr, bytes);
>> ``` 
>> This is possible because `NmtVirtualMemoryLocker` now checks 
>> `MemTracker::enabled()`.  `MemTracker::record_some_operation` already checks 
>> `MemTracker::enabled()` and checks against nullptr.  This refactoring 
>> previously wasn't possible because `ThreadCritical` was used before 
>> https://github.com/openjdk/jdk/pull/22745 introduced 
>> `NmtVirtualMemoryLocker`. 
>> 
>> I considered moving the locking and NMT accounting down into platform 
>> specific code:  Ex. lock around { munmap() + MemTracker:...
>
> Robert Toyonaga has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   exclude file mapping tests on AIX.

I think this looks good to me, but please seek feedback from others as well.

I've added a couple of suggestions. None of them are required, but I think they 
would be nice to do.

src/hotspot/share/runtime/os.cpp line 2206:

> 2204: // when it is actually committed. The opposite scenario is not guarded 
> against. pd_commit_memory and
> 2205: // record_virtual_memory_commit do not happen atomically. We assume 
> that there is some external synchronization
> 2206: // that prevents a region from being uncommitted before it is finished 
> being committed.

It's not a requirement, but you get kudos from me if you keep comments lines 
below 80 lines. I typically don't like code to be 80 lines, but comments tend 
to be nicer if they are.

test/hotspot/gtest/runtime/test_os.cpp line 1123:

> 1121: 
> 1122:   char* base = os::reserve_memory(size, false, mtTest);
> 1123:   ASSERT_NE(base, (char*) nullptr);

Suggestion:

  ASSERT_NOT_NULL(base);


And the same in other places.

test/hotspot/gtest/runtime/test_os.cpp line 1133:

> 1131: }
> 1132: 
> 1133: #if !defined(_AIX)

Suggestion:

#if !defined(_AIX)


I suggest a blank line here because this ifdef spans multiple tests and not 
only the nearest test. Having a blank line makes it clearer that this is a 
large ifdef that is not only related to the test case that it is bunched up 
against.

test/hotspot/gtest/runtime/test_os.cpp line 1145:

> 1143:   EXPECT_TRUE(result != nullptr);
> 1144: 
> 1145:   EXPECT_TRUE(strcmp(letters, result)==0);

Suggestion:

  EXPECT_TRUE(strcmp(letters, result) == 0);

but probably even better:
Suggestion:

  EXPECT_EQ(strcmp(letters, result), 0);

test/hotspot/gtest/runtime/test_os.cpp line 1184:

> 1182:   ::close(fd);
> 1183: }
> 1184: #endif

Suggestion:


#endif // !defined(_AIX)

I suggest a blank line and a matching comment. I know some HotSpots devs tend 
to appreciate those comments.

-------------

Marked as reviewed by stefank (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/24084#pullrequestreview-2750137709
PR Review Comment: https://git.openjdk.org/jdk/pull/24084#discussion_r2033287481
PR Review Comment: https://git.openjdk.org/jdk/pull/24084#discussion_r2033292443
PR Review Comment: https://git.openjdk.org/jdk/pull/24084#discussion_r2033303666
PR Review Comment: https://git.openjdk.org/jdk/pull/24084#discussion_r2033294266
PR Review Comment: https://git.openjdk.org/jdk/pull/24084#discussion_r2033307030

Reply via email to