On Thu, 19 Dec 2024 21:04:51 GMT, Robert Toyonaga <d...@openjdk.org> wrote:

>> This is a redo of [JDK-8304824](https://bugs.openjdk.org/browse/JDK-8304824) 
>> which was backed out by 
>> [JDK-8343726](https://bugs.openjdk.org/browse/JDK-8343726) due to problems 
>> documented in [JDK-8343244](https://bugs.openjdk.org/browse/JDK-8343244).
>> 
>> The problem was that `NmtVirtualMemoryLocker` was not locking when the 
>> current thread is not attached by checking `Thread::current_or_null_safe() 
>> != nullptr`. This is necessary during VM init, but should not be allowed 
>> afterward. NMT may be used in `attach_current_thread` before the current 
>> thread is set. The lock was not being acquired in that case, which 
>> intermittently caused NMT accounting to become corrupted, triggering various 
>> assertions when future NMT operations are done.  To fix this, I've adopted 
>> [Thomas' 
>> suggestion](https://github.com/openjdk/jdk/pull/21928#issuecomment-2460238057)
>>  to reverse the order of 
>> 
>> 
>> thread->register_thread_stack_with_NMT();
>> thread->initialize_thread_current();
>> 
>> 
>> in `attach_current_thread`.  This allows `NmtVirtualMemoryLocker` to be 
>> locked after current thread is set. 
>> 
>> To allow for `NmtVirtualMemoryLocker` to be used during VM init, I've 
>> replaced the `ConditionalMutexLocker` check `Thread::current_or_null_safe() 
>> != nullptr` with a new flag: `_done_bootstrap`. This flag prevents the lock 
>> from being used during VM init, at which point we are single threaded 
>> anyway. This avoids errors due to Hotspot mutexes and current thread not yet 
>> being ready. 
>> 
>> I also added new asserts in `virtualMemoryTracker.cpp` to catch future bugs 
>> like this where the lock is not held when it should be. I updated the 
>> appropriate VMT tests to also lock (there were a few cases where locking was 
>> being bypassed) so they can pass the new asserts.
>> 
>> I also removed the unused `_query_lock` variable in `MemTracker`.
>> 
>> Testing: 
>> 
>> - On Linux amd64, I was able to consistently reproduce the errors described 
>> in [JDK-8343244](https://bugs.openjdk.org/browse/JDK-8343244) by increasing 
>> the number of test threads in 
>> `java/lang/Thread/jni/AttachCurrentThread/AttachTest.java`. The test 
>> consistently passes with the new changes in this PR.
>> - hotspot_nmt , gtest:VirtualSpace, tier1
>
> Robert Toyonaga has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Revert to using NmtVirtualMemoryLocker. Use defaultStream. Add comment for 
> SharedDecoder_lock

I had a few minor comments and hopefully easy changes to make.

src/hotspot/share/nmt/virtualMemoryTracker.cpp line 421:

> 419:   assert(_reserved_regions != nullptr, "Sanity check");
> 420:   assert(!MemTracker::is_done_bootstrap() || 
> NmtVirtualMemory_lock->owned_by_self() , "Should have acquired 
> NmtVirtualMemory_lock");
> 421: 

You could add this to MemTracker class like:
 ```
 inline static void assert_locked();


Then put the body  

void MemTracker::assert_locked() {
  assert(!is_bootstrapping_done() || NmtVirtualMemory_lock->owned_by_self(),
            "should have acquired NmtVirtualMemory_lock");
}

Then all these calls could be MemTracker::assert_locked().

src/hotspot/share/nmt/virtualMemoryTracker.cpp line 631:

> 629: 
> 630:   bool do_allocation_site(const ReservedMemoryRegion* rgn) {
> 631:     assert_lock_strong(NmtVirtualMemory_lock);

So this is past bootstrapping?

src/hotspot/share/runtime/mutexLocker.cpp line 292:

> 290:   MUTEX_DEFN(NMTCompilationCostHistory_lock  , PaddedMutex  , 
> nosafepoint);
> 291:   MUTEX_DEFN(NmtVirtualMemory_lock           , PaddedMutex  , 
> service-4); // Must be lower than G1Mapper_lock used from 
> G1RegionsSmallerThanCommitSizeMapper::commit_regions
> 292: #if INCLUDE_CDS

Yes, this looks good.  I see why you couldn't move this one to MUTEX_DEFL.

-------------

Changes requested by coleenp (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/22745#pullrequestreview-2517782112
PR Review Comment: https://git.openjdk.org/jdk/pull/22745#discussion_r1894080029
PR Review Comment: https://git.openjdk.org/jdk/pull/22745#discussion_r1894080796
PR Review Comment: https://git.openjdk.org/jdk/pull/22745#discussion_r1894084698

Reply via email to