On Tue, 7 May 2024 18:53:23 GMT, Chris Plummer <cjplum...@openjdk.org> wrote:
>> This PR adds ranked monitor support to the debug agent. The debug agent has >> a large number of monitors, and it's really hard to know which order to grab >> them in, and for that matter which monitors might already be held at any >> given moment. By imposing a rank on each monitor, we can check to make sure >> they are always grabbed in the order of their rank. Having this in place >> when I was working on >> [JDK-8324868](https://bugs.openjdk.org/browse/JDK-8324868) would have made >> it much easier to detect a deadlock that was occuring, and the reason for >> it. That's what motivated me to do this work >> >> There were 2 or 3 minor rank issues discovered as a result of these changes. >> I also learned a lot about some of the more ugly details of the locking >> support in the process. >> >> Tested with the following on all supported platforms and with virtual >> threads: >> >> com/sun/jdi >> vmTestbase/nsk/jdi >> vmTestbase/nsk/jdb >> vmTestbase/nsk/jdwp >> >> Still need to run tier2 and tier5. >> >> Details of the changes follow in the first comment. > > Chris Plummer has updated the pull request incrementally with two additional > commits since the last revision: > > - Fix jcheck extra whitespace. > - Fix comment typo. src/jdk.jdwp.agent/share/native/libjdwp/debugLoop.c line 66: > 64: debugLoop_initialize(void) > 65: { > 66: vmDeathLock = debugMonitorCreate(vmDeathLockForDebugLoop_Rank, "JDWP > VM_DEATH Lock"); Nit: Just a suggestion to consider... Can all the `debugMonitor's` created by one function `debugMonitor_initialize()` called by initialize() at start . Then each monitor consumer can cache needed `debugMonitor's` like this: vmDeathLock = getDebugMonitor(vmDeathLockForDebugLoop_Rank); src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1117: > 1115: jrawMonitorID monitor; > 1116: const char* name; > 1117: DebugRawMonitorRank rank; Nit: Please, consider to add same comment for lines #1116-1117 as for the line #1119 below. It might be useful to move the field `ownerThread` before the other fields (not sure about the `monitor` field), so the same comment can be shared by multiple fields. src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1297: > 1295: // We must hold the dbgRawMonitor when calling verifyMonitorRank() > 1296: > 1297: // Iterate over all the monitors and makes sure we don't already > hold one that Nit: Typo: "makes sure" => "make sure". src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1343: > 1341: } > 1342: } > 1343: } Nit: I'd suggest to consider two separate loops instead of one, something like below: static void assertOrderFailure(jthread thread, DebugRawMonitorRank rank, DebugRawMonitor* dbg_monitor, char* msg) { char* threadName = getThreadName(thread); tty_message("DebugRawMonitor %s failure: (%s: %d > %d) for thread (%s)", msg, dbg_monitor->name, dbg_monitor->rank, rank, threadName); jvmtiDeallocate(threadName); JDI_ASSERT(JNI_FALSE); } static void verifyMonitorRank(JNIEnv *env, DebugRawMonitorRank rank, jthread thread) { DebugRawMonitorRank i; // 1. Verify the held monitor's rank is lower than the rank of the monitor we are trying to enter. for (i = rank + 1; i < FIRST_LEAF_DEBUG_RAW_MONITOR; i++) { DebugRawMonitor* dbg_monitor = &dbg_monitors[i]; if (dbg_monitor->ownerThread != NULL && isSameObject(env, dbg_monitor->ownerThread, thread)) { // the lower ranked monitor is already owned by this thread assertOrderFailure(thread, rank, dbg_monitor, "RANK ORDER"); } } // 2. Verify there are no other leaf monitors owned but this thread. for (i = FIRST_LEAF_DEBUG_RAW_MONITOR; i < NUM_DEBUG_RAW_MONITORS; i++) { DebugRawMonitor* dbg_monitor = &dbg_monitors[i]; if (i != rank && isSameObject(env, dbg_monitor->ownerThread, thread)) { // the leaf ranked monitor is already owned by this thread assertOrderFailure(thread, rank, dbg_monitor, "LEAF RANK"); } } I hope, this can be tweaked to adopt the `popFrame` locks correction. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1594254039 PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1594328246 PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1594257313 PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1594285396