On Mon, 17 Mar 2025 16:20:42 GMT, Robert Toyonaga <d...@openjdk.org> wrote:

> ### Summary:
> This PR makes memory operations atomic with NMT accounting.
> 
> ### The problem:
> In memory related functions like `os::commit_memory` and `os::reserve_memory` 
> the OS memory operations are currently done before acquiring the the NMT 
> mutex. And the the virtual memory accounting is done later in `MemTracker`, 
> after the lock has been acquired. Doing the memory operations outside of the 
> lock scope can lead to races. 
> 
> 1.1 Thread_1 releases range_A.
> 1.2 Thread_1 tells NMT "range_A has been released".
> 
> 2.1 Thread_2 reserves (the now free) range_A.
> 2.2 Thread_2 tells NMT "range_A is reserved".
> 
> Since the sequence (1.1)  (1.2) is not atomic, if Thread_2 begins operating 
> after (1.1), we can have (1.1) (2.1) (2.2) (1.2). The OS sees two valid 
> subsequent calls (release range_A, followed by map range_A). But NMT sees 
> "reserve range_A", "release range_A" and is now out of sync with the OS.
> 
> ### Solution:
> Where memory operations such as reserve, commit, or release virtual memory 
> happen,  I've expanded the scope of `NmtVirtualMemoryLocker` to protect both 
> the NMT accounting and the memory operation itself. 
> 
> ### Other notes:
> I also simplified this  pattern found in many places:
> 
> if (MemTracker::enabled()) {
>   MemTracker::NmtVirtualMemoryLocker nvml;
>   result = pd_some_operation(addr, bytes);
>   if (result != nullptr) {
>     MemTracker::record_some_operation(addr, bytes);
>   }
> } else {
>   result = pd_unmap_memory(addr, bytes);
> }
> ``` 
> To:
> 
> MemTracker::NmtVirtualMemoryLocker nvml;
> result = pd_unmap_memory(addr, bytes);
> MemTracker::record_some_operation(addr, bytes);
> ``` 
> This is possible because `NmtVirtualMemoryLocker` now checks 
> `MemTracker::enabled()`.  `MemTracker::record_some_operation` already checks 
> `MemTracker::enabled()` and checks against nullptr.  This refactoring 
> previously wasn't possible because `ThreadCritical` was used before 
> https://github.com/openjdk/jdk/pull/22745 introduced 
> `NmtVirtualMemoryLocker`. 
> 
> I considered moving the locking and NMT accounting down into platform 
> specific code:  Ex. lock around { munmap() + MemTracker::record }. The hope 
> was that this would help reduce the size of the critical section. However, I 
> found that the OS-specific "pd_" functions are already short and 
> to-the-point, so doing this wasn't reducing the lock scope very much. Instead 
> it just makes the code more messy by having to maintain the locking and NMT 
> accounting in each platform specific implementation. 
> 
> In many places I've done minor refactoring by relocating call...

OK should I update this PR to do the following things:
-  Add comments explaining the asymmetrical locking and warning against 
patterns that lead to races
- swapping the order of `NmtVirtualMemoryLocker` and release/uncommit
- Fail fatally if release/uncommit does not complete.

Or does it make more sense to do that in a different issue/PR?

Also, do we want to keep the new tests and the refactorings (see below)?

if (MemTracker::enabled()) {
  MemTracker::NmtVirtualMemoryLocker nvml;
  result = pd_some_operation(addr, bytes);
  if (result != nullptr) {
    MemTracker::record_some_operation(addr, bytes);
  }
} else {
  result = pd_unmap_memory(addr, bytes);
}

To:

MemTracker::NmtVirtualMemoryLocker nvml;
result = pd_unmap_memory(addr, bytes);
MemTracker::record_some_operation(addr, bytes);

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24084#issuecomment-2766365411

Reply via email to