The real purpose of this PR is to add virtual thread support to
ThreadMemoryLeakTest.java, but this exposed bugs in both the debug agent and
TestScaffold, so those are being fixed as well (the debug agent bug is the CR
being used for this PR).

The debug agent bug is due to a race condition during VM exit. The VM is in the 
process of shutting down. The debug agent has already disabled JVMTI callbacks 
and has sent the VMDeathEvent. At this point there are also threads exiting
that the debug agent knows about, but it will not get a ThreadEndEvent for
them because the callbacks are disabled. Thus these threads remain in the
debug agent's list of known threads, even though they have exited. The debugger
receives the VMDeathEvent and does a VM.resume(). During the debug agent's
handling of the VM.Resume command, it iterates over all known threads and needs
to map each to its ThreadNode so it can be resumed, and this mapping requires 
accessing the JVMTI TLS for the thread. The problem is some of the threads may 
have exited already, and therefore no longer have TLS. This results in the 
assert in the debug agent. This debug agent issue was already addressed for 
platform threads, but not for virtual threads, which is why we started seeing
this issue when this test was modified. The fix is to just
replicate what is done for platform threads for virtual threads also.

The TestScaffold bug is that if the debuggee crashes/asserts, this is likely to 
go unnoticed, especially if it happens during VM exit (and the test essentially 
has already completed). Because of this TestScaffold bug, the debug agent bug 
above did not result in a test failure. After fixing TestScaffold to check the 
exitCode of the debuggee process, the test started to appropriately fail until 
the debug agent was fixed.
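
As a rough illustration of the kind of check described above, here is a
minimal sketch. It is not the actual TestScaffold change: the class name
DebuggeeExitCheck and both helper methods are hypothetical, and it assumes
the harness has a java.lang.Process handle for the debuggee and treats any
non-zero exit code as a failure.

```java
public class DebuggeeExitCheck {
    // Hypothetical helper (not the real TestScaffold code): fail the test
    // if the debuggee did not exit cleanly.
    static void checkExitCode(int exitCode) {
        if (exitCode != 0) {
            throw new RuntimeException(
                "Debuggee exited with non-zero exit code: " + exitCode);
        }
    }

    // Wait for the debuggee process to terminate, then validate its exit
    // code. Process.waitFor() returns the exit value of the process.
    static void waitForAndCheck(Process debuggee) throws InterruptedException {
        checkExitCode(debuggee.waitFor());
    }

    public static void main(String[] args) {
        // A clean exit passes silently.
        checkExitCode(0);

        // Any non-zero value (134 is just an arbitrary example here) now
        // turns a late crash or assert into a visible test failure.
        try {
            checkExitCode(134);
        } catch (RuntimeException e) {
            System.out.println("Test fails as intended: " + e.getMessage());
        }
    }
}
```

The point of doing the check after the debuggee has fully terminated is that
a crash during VM exit is caught even when the test logic itself has already
completed.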

One other thing to point out is the OOME issue I started getting frequently 
when testing with virtual threads. Since virtual threads are created at a much 
higher rate than platform threads, their creation started to overwhelm the 
debugger (actually the JDI implementation). There is already a mechanism in 
place to do a VM.HoldEvents if JDI has queued up 10,000 events. The problem is
that events are coming in so fast that even after doing the VM.HoldEvents, the 
number of queued events continues to go up for a while, and sometimes reaches 
30,000 or more. This raises the peak memory usage of the test quite a bit. 
Since the test purposely uses a small heap so a memory leak is quickly and 
reliably detected, the large queue often results in an OOME. Because of this I
now make virtual threads sleep for 100ms instead of 50ms to slow down their
creation, which resolved the issue.
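
To make the overshoot easier to picture, here is a toy model, not the JDI
implementation. It assumes events arrive at a fixed rate per tick, nothing is
drained from the queue, and the hold only takes effect some number of ticks
after the threshold is crossed. The class name, the tick-based model, and all
rates and latencies are invented for illustration; only the 10,000 event
threshold comes from the description above.

```java
public class HoldEventsOvershoot {
    // Toy simulation: returns the peak queue size when a hold is requested
    // once 'threshold' events are queued, but only takes effect
    // 'holdLatencyTicks' ticks later. All parameters are illustrative.
    static int peakQueueSize(int threshold, int arrivalsPerTick,
                             int holdLatencyTicks, int totalTicks) {
        int queued = 0;
        int peak = 0;
        int holdEffectiveAt = -1; // -1 means the hold was not requested yet

        for (int tick = 0; tick < totalTicks; tick++) {
            boolean held = holdEffectiveAt >= 0 && tick >= holdEffectiveAt;
            if (!held) {
                queued += arrivalsPerTick; // events keep arriving until held
            }
            peak = Math.max(peak, queued);
            if (holdEffectiveAt < 0 && queued >= threshold) {
                // Threshold crossed: request the hold, effective after a delay.
                holdEffectiveAt = tick + 1 + holdLatencyTicks;
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        // Compare a faster and a slower event producer under the same delay.
        System.out.println("fast producer peak: "
            + peakQueueSize(10_000, 1_000, 20, 100));
        System.out.println("slow producer peak: "
            + peakQueueSize(10_000, 500, 20, 100));
    }
}
```

In this model the peak grows with the arrival rate times the delay before the
hold takes effect, which is why slowing down thread creation lowers the peak
even though the threshold itself is unchanged.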

I tested by running all of test/jdk/com/sun/jdi 25 times on each platform with 
and without virtual thread testing enabled.

-------------

Commit messages:
 - Support virtual thread testing.
 - Fix issues with missing virtual thread during VM shutdown

Changes: https://git.openjdk.org/jdk/pull/13246/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13246&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8305209
  Stats: 40 lines in 3 files changed: 35 ins; 0 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/13246.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13246/head:pull/13246

PR: https://git.openjdk.org/jdk/pull/13246
