On Thu, 30 Mar 2023 16:03:05 GMT, Chris Plummer <cjplum...@openjdk.org> wrote:
> The real purpose of this PR is to add virtual thread support to
> ThreadMemoryLeakTest.java, but this exposed bugs in both the debug agent and
> in TestScaffold, so those are being fixed also (and the debug agent bug is
> the CR being used).
>
> The debug agent bug is due to a race condition during VM exit. The VM is in
> the process of shutting down. The debug agent has already disabled JVMTI
> callbacks and has sent the VMDeathEvent. At this point in time there are also
> threads exiting that the debug agent knows about, but for which it will not
> get a ThreadEndEvent because the callbacks are disabled. Thus these
> threads remain in the debug agent's list of known threads, even though they
> have exited. The debugger receives the VMDeathEvent and does a VM.resume().
> During the debug agent's handling of the VM.Resume command, it iterates over
> all known threads and needs to map each to its ThreadNode so it can be
> resumed, and this mapping requires accessing the JVMTI TLS for the thread.
> The problem is some of the threads may have exited already, and therefore no
> longer have TLS. This results in the assert in the debug agent. This debug
> agent issue was already addressed for platform threads, but not for virtual
> threads, which is why we started seeing this issue when this test was
> modified. The fix is to just replicate for virtual threads what is already
> done for platform threads.
>
> The TestScaffold bug is that if the debuggee crashes/asserts, this is likely
> to go unnoticed, especially if it happens during VM exit (and the test
> essentially has already completed). Because of this TestScaffold bug, the
> debug agent bug above did not result in a test failure. After fixing
> TestScaffold to check the exitCode of the debuggee process, the test started
> to appropriately fail until the debug agent was fixed.
>
> One other thing to point out is the OOME issue I started getting frequently
> when testing with virtual threads. Since virtual threads are created at a
> much higher rate than platform threads, their creation started to overwhelm
> the debugger (actually the JDI implementation). There is already a mechanism
> in place to do a VM.HoldEvents if JDI has queued up 10,000 events. The problem
> is that events are coming in so fast that even after doing the VM.HoldEvents,
> the number of queued events continues to go up for a while, and sometimes
> reaches 30,000 or more. This raises the peak memory usage of the test quite a
> bit. Since the test purposely uses a small heap so a memory leak is quickly
> and reliably detected, the large queue often results in an OOME. Because of
> this I make virtual threads sleep for 100ms instead of 50ms to slow down
> their creation, and this resolved the issue.
>
> I tested by running all of test/jdk/com/sun/jdi 25 times on each platform
> with and without virtual thread testing enabled.

This pull request has now been integrated.

Changeset: 1d517afb
Author:    Chris Plummer <cjplum...@openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/1d517afbd4547171ad6fb6a3356351c2554c8279
Stats:     33 lines in 3 files changed: 28 ins; 0 del; 5 mod

8305209: JDWP exit error AGENT_ERROR_INVALID_THREAD(203): missing entry in running thread table

Reviewed-by: sspitsyn, lmesnik

-------------

PR: https://git.openjdk.org/jdk/pull/13246
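
For readers who want a concrete picture of the TestScaffold fix described above (checking the debuggee's exit code so that a crash or assert during VM exit is not silently ignored), here is a minimal sketch. It is not the code from the changeset: the class name DebuggeeExitCheck and the methods checkExitCode and waitAndCheck are hypothetical, and the child JVM started in main is only a stand-in for a real debuggee. In a JDI-based test the Process would more likely come from VirtualMachine.process(), which returns the debuggee process when the VM was started by a launching connector.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical helper illustrating an exit-code check; not the actual TestScaffold code. */
public class DebuggeeExitCheck {

    /** Throws if the debuggee process ended with a non-zero exit code. */
    public static void checkExitCode(int exitCode) {
        if (exitCode != 0) {
            throw new RuntimeException(
                    "Debuggee exited with non-zero exit code: " + exitCode);
        }
    }

    /** Waits for the process to terminate, then validates its exit code. */
    public static void waitAndCheck(Process debuggee) throws InterruptedException {
        int exitCode = debuggee.waitFor();
        checkExitCode(exitCode);
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Assumption: a child JVM running "-version" stands in for the debuggee.
        String java = System.getProperty("java.home") + "/bin/java";
        Process p = new ProcessBuilder(List.of(java, "-version"))
                .inheritIO()
                .start();
        waitAndCheck(p);
        System.out.println("debuggee exited cleanly");
    }
}
```

The point of failing on any non-zero exit code is that a problem occurring after the test logic has finished, such as the debug agent assert during VM exit, still turns into a visible test failure.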