On Wed, 18 Dec 2024 15:09:56 GMT, Robert Toyonaga <d...@openjdk.org> wrote:
>> This is a redo of [JDK-8304824](https://bugs.openjdk.org/browse/JDK-8304824) >> which was backed out by >> [JDK-8343726](https://bugs.openjdk.org/browse/JDK-8343726) due to problems >> documented in [JDK-8343244](https://bugs.openjdk.org/browse/JDK-8343244). >> >> The problem was that `NmtVirtualMemoryLocker` was not locking when the >> current thread is not attached by checking `Thread::current_or_null_safe() >> != nullptr`. This is necessary during VM init, but should not be allowed >> afterward. NMT may be used in `attach_current_thread` before the current >> thread is set. The lock was not being acquired in that case, which >> intermittently caused NMT accounting to become corrupted, triggering various >> assertions when future NMT operations are done. To fix this, I've adopted >> [Thomas' >> suggestion](https://github.com/openjdk/jdk/pull/21928#issuecomment-2460238057) >> to reverse the order of >> >> >> thread->register_thread_stack_with_NMT(); >> thread->initialize_thread_current(); >> >> >> in `attach_current_thread`. This allows `NmtVirtualMemoryLocker` to be >> locked after current thread is set. >> >> To allow for `NmtVirtualMemoryLocker` to be used during VM init, I've >> replaced the `ConditionalMutexLocker` check `Thread::current_or_null_safe() >> != nullptr` with a new flag: `_done_bootstrap`. This flag prevents the lock >> from being used during VM init, at which point we are single threaded >> anyway. This avoids errors due to Hotspot mutexes and current thread not yet >> being ready. >> >> I also added new asserts in `virtualMemoryTracker.cpp` to catch future bugs >> like this where the lock is not held when it should be. I updated the >> appropriate VMT tests to also lock (there were a few cases where locking was >> being bypassed) so they can pass the new asserts. >> >> I also removed the unused `_query_lock` variable in `MemTracker`. >> >> Testing: >> >> - On Linux amd64, I was able to consistently reproduce the errors described >> in [JDK-8343244](https://bugs.openjdk.org/browse/JDK-8343244) by increasing >> the number of test threads in >> `java/lang/Thread/jni/AttachCurrentThread/AttachTest.java`. The test >> consistently passes with the new changes in this PR. >> - hotspot_nmt , gtest:VirtualSpace, tier1 > > Robert Toyonaga has updated the pull request with a new target base due to a > merge or a rebase. The pull request now contains seven commits: > > - Merge master > - move MemTracker::set_done_bootstrap() > - Update src/hotspot/share/utilities/vmError.cpp > > Co-authored-by: David Holmes > <62092539+dholmes-...@users.noreply.github.com> > - updates tests and remove old class > - use ConditionalMutexLocker directly > - Fix concurrency issue. Add assertions. Update tests. > - Revert "8343726: [BACKOUT] NMT should not use ThreadCritical" > > This reverts commit c3df050b88ecef123199a4e96f6d9884d064ae45. Sorry if I'm repeating other comments, but it looks like a good change but could be a bit less repetition. Thanks for removing this instance of ThreadCritical. Oh yes, now I see, sorry I did repeat Kim's suggestion. src/hotspot/share/runtime/mutexLocker.cpp line 292: > 290: MUTEX_DEFN(NMTQuery_lock , PaddedMutex , safepoint); > 291: MUTEX_DEFN(NMTCompilationCostHistory_lock , PaddedMutex , > nosafepoint); > 292: MUTEX_DEFN(NmtVirtualMemory_lock , PaddedMutex , service-4); Why is this service-4? Does it depend on being a rank lower than another lock? If so, this and the SharedDecoder_lock should be declared below as MUTEX_DEFL and call out that lock. src/hotspot/share/runtime/os.cpp line 2313: > 2311: bool result; > 2312: if (MemTracker::enabled()) { > 2313: ConditionalMutexLocker cml(NmtVirtualMemory_lock, > MemTracker::is_done_bootstrap(), Mutex::_no_safepoint_check_flag); This pattern is all over. Can you create a class in nmtSomeHeader like: class NMTLocker { ConditionalMutexLocker cml; NMTLocker(); }; in cpp file: NmtLocker() : _cml(NmtVirtualMemory_lock, MemTracker::is_done_bootstrap(), Mutex::_no_safepoint_check_flag) {} Or something like that, rather than this be repeated everywhere. I only skimmed the other comments, so maybe David and Kim said the same thing. ------------- Changes requested by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/22745#pullrequestreview-2513014280 PR Comment: https://git.openjdk.org/jdk/pull/22745#issuecomment-2552430048 PR Review Comment: https://git.openjdk.org/jdk/pull/22745#discussion_r1890930186 PR Review Comment: https://git.openjdk.org/jdk/pull/22745#discussion_r1890932610