On Sat, 14 Sep 2024 06:56:01 GMT, Thomas Stuefe <stu...@openjdk.org> wrote:

>> Hi @tstuefe, it looks like calling 
>> [`os::print_memory_mappings`](https://github.com/openjdk/jdk/blob/master/src/hotspot/os/windows/os_windows.cpp#L3785)
>>  from the windows implementation of `os::pd_release_memory` is causing the 
>> rank conflict between `tty_lock` and `NMT_lock`.  This is getting called 
>> from `os::release_memory` [**after** we've already acquired the lock for 
>> NMT](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/os.cpp#L2185-L2189).
>> 
>> gtest `release_bad_ranges` --> `os::release_memory` and acquire `NMT_lock` 
>> --> `os::pd_release_memory` --> `os::print_memory_mappings` and acquire 
>> `tty_lock`
>> 
>> Is there a reason we cast a wider net and lock from outside of NMT code in 
>> `os::release_memory`, `os::uncommit_memory`,  and `os::unmap_memory`?  By 
>> contrast, in other functions such as `os::commit_memory`  and 
>> `os::reserve_memory` we lock inside of `MemTracker` instead. 
>> For example, `pd_release_memory` is protected by the NMT lock, but 
>> `pd_reserve_memory` is not. Is the lock meant to synchronize the actual 
>> memory operations as well as NMT recording in some cases? If not, then maybe 
>> it makes sense to  delay and move the lock acquisition inside `MemTracker` 
>> for consistency and to reduce surface area for lock conflicts.
>
> Hi @roberttoyonaga 
> 
> (Pinging @afshin-zafari and @jdksjolen)
> 
> First off, it is good that you ran into this since it highlights a possible 
> real deadlock. Both locks are quite coarse. I cannot think from the top of my 
> head of a scenario where we hold the tty lock and need NMT locking, but would 
> not bet money on it. E.g. I bet we call malloc under ttylock (if only as a 
> side effect of calling functions that return resource area). Malloc does not 
> need locking; but its brittle and small changes can cause deadlocks (e.g. 
> folks were thinking about moving ResourceArea memory to mmap, which needs NMT 
> locks for tracking).
> 
>> Hi @tstuefe, it looks like calling 
>> [`os::print_memory_mappings`](https://github.com/openjdk/jdk/blob/master/src/hotspot/os/windows/os_windows.cpp#L3785)
>>  from the windows implementation of `os::pd_release_memory` is causing the 
>> rank conflict between `tty_lock` and `NMT_lock`. This is getting called from 
>> `os::release_memory` [**after** we've already acquired the lock for 
>> NMT](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/os.cpp#L2185-L2189).
>> 
>> gtest `release_bad_ranges` --> `os::release_memory` and acquire `NMT_lock` 
>> --> `os::pd_release_memory` --> `os::print_memory_mappings` and acquire 
>> `tty_lock`
>> 
> 
> Ah thank you. I think I added that printing ages ago. I should not have used 
> tty. The immediate solution to get you going may be to just change this from 
> tty to `fileStream fs(stderr);`
> 
>> Is there a reason we cast a wider net and lock from outside of NMT code in 
>> `os::release_memory`, `os::uncommit_memory`, and `os::unmap_memory`? By 
>> contrast, in other functions such as `os::commit_memory` and 
>> `os::reserve_memory` we lock inside of `MemTracker` instead. For example, 
>> `pd_release_memory` is protected by the NMT lock, but `pd_reserve_memory` is 
>> not. Is the lock meant to synchronize the actual memory operations as well 
>> as NMT recording in some cases? If not, then maybe it makes sense to delay 
>> and move the lock acquisition inside `MemTracker` for consistency and to 
>> reduce surface area for lock conflicts.
> 
> The problem is that both tty_lock and the NMT lock are quite coarse. I always 
> maintained that tty_lock is too coarse - we have many places where there is a 
> ttyLocker somewhere at the start of a printing function, then the function 
> goes and prints tons of infos it *retrieves under lock protection*. The 
> better way to solve this is to assemble the text outside of lock protection, 
> then only do the printing under lock protection. We ...

> Hi @tstuefe,
> 
> > Ah thank you. I think I added that printing ages ago. I should not have 
> > used tty. The immediate solution to get you going may be to just change 
> > this from tty to fileStream fs(stderr);
> 
> Thanks! I adopted your suggestion and replaced tty. The lock conflict is gone 
> now.
> 
> > Ensuring that we feed NMT the tracking information in the order they 
> > happened. This is only relevant for mmap accounting, since only there do we 
> > need to track address ranges. For malloc accounting, we just 
> > increase/decrease counters, and there, the order of operations does not 
> > matter, just that we do it atomically.
> > ...
> > A pragmatic solution would be to move the locking down the stack. 
> > Currently, we lock in the generic layer around { pd_release_memory() + 
> > MemTracker::record }. But we only need to lock around the OS allocation 
> > function (mmap/munmap/VirtualAlloc etc). So we could move the locks 
> > downward to the OS specific layer where the actual allocations happen in 
> > the OS code, e.g. just lock around { munmap() + MemTracker::record }.
> 
> Ok I see. So the locking should indeed not happen inside of `MemTracker`. 
> Yes, maybe moving the locking down the stack into platform specific code 
> would be a good task for the future. However, if the intention is to feed NMT 
> the tracking info in the order it happened, why are the operations on memory 
> in `os::commit_memory` and `os::reserve_memory` not protected by the lock? In 
> those, we delay and lock inside `MemTracker` instead.

That looks like an error that should be fixed.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20852#issuecomment-2357563280

Reply via email to