On Sat, 14 Sep 2024 02:35:37 GMT, Robert Toyonaga <d...@openjdk.org> wrote:

>>> > After switching to a Hotspot Mutex, it looks like the `windows-x64 / test 
>>> > (hs/tier1 common) GHA` is failing because the test `release_bad_ranges` 
>>> > in test_os.cpp is expecting an assertion and an error message to be 
>>> > printed. However, when this printing happens, `tty_lock` gets acquired 
>>> > out of rank order with the already held `NMT_lock`, causing the lock rank 
>>> > assertion fail. One solution would be to lower the rank of `tty_lock`. 
>>> > I'm not sure that's the best solution because that might cause rank 
>>> > conflicts with other locks (and it makes sense to give locks the highest 
>>> > possible rank to minimize future conflicts).
>>> 
>>> What code exactly locks tty_lock?
>> 
>> This is weird. We print the error in this case via 
>> print_error_for_unit_test(), which takes care only to use raw 
>> fprintf(stderr) for this very reason... therefore I am interested in 
>> understanding the lock semantics, to know if deadlock detection found a real 
>> problem or if this is just a weird error setup.
>
> Hi @tstuefe, it looks like calling 
> [`os::print_memory_mappings`](https://github.com/openjdk/jdk/blob/master/src/hotspot/os/windows/os_windows.cpp#L3785)
>  from the windows implementation of `os::pd_release_memory` is causing the 
> rank conflict between `tty_lock` and `NMT_lock`.  This is getting called from 
> `os::release_memory` [**after** we've already acquired the lock for 
> NMT](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/os.cpp#L2185-L2189).
> 
> gtest `release_bad_ranges` --> `os::release_memory` and acquire `NMT_lock` 
> --> `os::pd_release_memory` --> `os::print_memory_mappings` and acquire 
> `tty_lock`
> 
> Is there a reason we cast a wider net and lock from outside of NMT code in 
> `os::release_memory`, `os::uncommit_memory`,  and `os::unmap_memory`?  By 
> contrast, in other functions such as `os::commit_memory`  and 
> `os::reserve_memory` we lock inside of `MemTracker` instead. 
> For example, `pd_release_memory` is protected by the NMT lock, but 
> `pd_reserve_memory` is not. Is the lock meant to synchronize the actual 
> memory operations as well as NMT recording in some cases? If not, then maybe 
> it makes sense to  delay and move the lock acquisition inside `MemTracker` 
> for consistency and to reduce surface area for lock conflicts.

Hi @roberttoyonaga 

(Pinging @afshin-zafari and @jdksjolen)

First off, it is good that you ran into this, since it highlights a possible 
real deadlock. Both locks are quite coarse. I cannot think, off the top of my 
head, of a scenario where we hold the tty lock and then need NMT locking, but I 
would not bet money on it. E.g. I bet we call malloc under tty_lock (if only as 
a side effect of calling functions that return resource-area memory). Malloc 
does not need NMT locking; but it's brittle, and small changes could introduce 
deadlocks (e.g. folks were thinking about moving ResourceArea memory to mmap, 
which does need NMT locks for tracking).

> Hi @tstuefe, it looks like calling 
> [`os::print_memory_mappings`](https://github.com/openjdk/jdk/blob/master/src/hotspot/os/windows/os_windows.cpp#L3785)
>  from the windows implementation of `os::pd_release_memory` is causing the 
> rank conflict between `tty_lock` and `NMT_lock`. This is getting called from 
> `os::release_memory` [**after** we've already acquired the lock for 
> NMT](https://github.com/openjdk/jdk/blob/master/src/hotspot/share/runtime/os.cpp#L2185-L2189).
> 
> gtest `release_bad_ranges` --> `os::release_memory` and acquire `NMT_lock` 
> --> `os::pd_release_memory` --> `os::print_memory_mappings` and acquire 
> `tty_lock`
> 

Ah, thank you. I think I added that printing ages ago; I should not have used 
tty. The immediate solution to get you going may be to just change the output 
from tty to `fileStream fs(stderr);`.

> Is there a reason we cast a wider net and lock from outside of NMT code in 
> `os::release_memory`, `os::uncommit_memory`, and `os::unmap_memory`? By 
> contrast, in other functions such as `os::commit_memory` and 
> `os::reserve_memory` we lock inside of `MemTracker` instead. For example, 
> `pd_release_memory` is protected by the NMT lock, but `pd_reserve_memory` is 
> not. Is the lock meant to synchronize the actual memory operations as well as 
> NMT recording in some cases? If not, then maybe it makes sense to delay and 
> move the lock acquisition inside `MemTracker` for consistency and to reduce 
> surface area for lock conflicts.

The problem is that both tty_lock and the NMT lock are quite coarse. I have 
always maintained that tty_lock is too coarse - we have many places where a 
ttyLocker sits at the start of a printing function, and the function then goes 
on to print tons of info that it *retrieves under lock protection*. The better 
way is to assemble the text outside of lock protection, then only do the actual 
printing under the lock. We do this in UL: we have a LogMessage class, which we 
use for multi-line output that must not tear.

The NMT lock is coarse too, but that is by design. It has two purposes:
1) Guarding internal global data structures, e.g. the MallocSiteTable or the 
list of virtual memory regions
2) Ensuring that we feed NMT the tracking information *in the order it 
happened*. This is only relevant for mmap accounting, since only there do we 
need to track address ranges. For malloc accounting, we just increase/decrease 
counters, and there the order of operations does not matter, only that we do it 
atomically.

(2) is the crux here. If it were only for (1), we could keep the 
synchronization blocks really small, e.g. just guard modifications of the 
MallocSiteTable.

A pragmatic solution would be to move the locking down the stack. Currently, we 
lock in the generic layer around { pd_release_memory() + MemTracker::record }. 
But we only need to lock around the *OS allocation function* 
(mmap/munmap/VirtualAlloc etc.). So we could move the locks down into the 
OS-specific layer, where the actual allocations happen, e.g. just lock around 
{ munmap() + MemTracker::record }.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20852#issuecomment-2350882050
