Re: RFR: 8324868: debug agent does not properly handle interrupts of a virtual thread

Chris Plummer Fri, 23 Feb 2024 11:11:42 -0800

On Fri, 23 Feb 2024 19:06:59 GMT, Chris Plummer <cjplum...@openjdk.org> wrote:


> Fix numerous debug agent issues and testing deficiencies related to 
> Thread.interrupt(). As a first read you should look at the description in the 
> CR. More details in first comment below.
> 
> Also, this PR is dependent on the fix for 
> [JDK-8325187](https://bugs.openjdk.org/browse/JDK-8325187). The test might 
> generate GHA failures in the meantime.
> 
> Tested with all svc tier1, tier2, and tier5 tests.

If Thread.interrupt() is called on a thread that is currently in the debug 
agent handling a JVMTI event, it can result in a RawMonitorWait() call 
returning with JVMTI_ERROR_INTERRUPT. This can also happen if the interrupt was 
already pending when the JVMTI event came in. When this happens the debug agent 
sets the ThreadNode->pendingInterrupt flag and re-enters the RawMonitorWait(). 
When the debug agent is done dealing with the event, it calls 
threadControl_onEventHandlerExit(), which calls doPendingTasks(), which will 
call JVMTI InterruptThread() to reset the interrupt status of the thread. There 
were a few bugs in this code related to virtual threads.

1st issue is that the call to JVMTI InterruptThread() resulted in an upcall to 
the java Thread.interrupt() while while holding threadLock. Problems arise when 
during this upcall the debug agent is re-entered due to an event. The case I 
ran into was due to single stepping, which was enabled by the 
InterruptHangTest.java. When the debug agent is re-entered, it grabs the 
handlerLock. This creates a deadlock potential because it already holds the 
threadLock, and should be grabbing the handlerLock before the threadLock. An 
indeed a deadlock did happen because another thread had already grabbed the 
handlerLock and was blocked trying to grab the threadLock.

My first attempt to fix this was to simply make 
threadControl_onEventHandlerExit() always grab both the threadLock and the 
handlerLock. Therefore when the debug agent was re-entered during the 
InterruptThread() call, it would already have both locks and not have a locking 
order issue. This seemed to work, but I didn't like the idea of the debug agent 
being re-entered with locks already held, so I reworked the code to no longer 
need the locks held during the InterruptThread() call.

A 2nd issue was found in threadControl_interrupt(). It is called when the 
ThreadReference.Interrupt JDWP command is issued. If the thread is currently 
handling an event, it is suppose to set ThreadNode->pendingInterrupt so the 
interrupt can be raised after handling the event is complete. However, it was 
calling findThread(&runningThreads), which means it would always fail to find a 
virtual thread. This ended up not really being an issue, because it would just 
call InterruptThread() immediatly as a result. As it turns out this is ok even 
if the thread is already handling an event because as described above, the 
debug agent already has code in place for this (the RawMonitorWait() retry 
code). So rather than fix the findThread() call I just made it so this code now 
always calls InterruptThread(), even if the thread is currently handling an 
event. Note this is tested by the new InterruptHangTest.java "remote" mode, 
which uses ThreadReference.interrupt().

A 3rd issue was in threadControl_currentThread(), which is used by 
handleInterrupt() to decide if threadControl_setPendingInterrupt() should be 
called to setup the deferred resetting of the interrupt. The problem was that 
threadControl_currentThread() was using findThread(&runningThreads), which 
won't find virtual threads. As a result of this, 
threadControl_setPendingInterrupt() was never being called for virtual threads, 
so the interrupted was lost. It could have been fixed by using 
findRunningThread(), but I decided to just instead use JVMTI to get the current 
thread. 

A 4th issue was threadControl_setPendingInterrupt(), which actually has 
multiple issues itself. The first is the comment, which is dated and no longer 
correct, so I removed it. Related to that is the assert, which I wasn't sure 
why it wasn't being triggered for virtual threads. The 
threadControl_currentThread() issue above ended up being the reason. Once that 
was fixed, the assert was triggered and needed to be removed. The 3rd is 
another incorrect call to findThread(&runningThreads), which won't find virtual 
threads. This was fixed to instead use findRunningThread(), which will find 
virtual threads. I also removed the check for a NULL node and instead added an 
assert since it should always be found.

The test has been reworked to have 3 modes now and do a much better job of 
testing. Read the long comment in the test for details.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17989#issuecomment-1961849739

Re: RFR: 8324868: debug agent does not properly handle interrupts of a virtual thread

Reply via email to