On Wed, 1 May 2024 21:32:46 GMT, Chris Plummer <cjplum...@openjdk.org> wrote:

> This PR adds ranked monitor support to the debug agent. The debug agent has a 
> large number of monitors, and it's really hard to know which order to grab 
> them in, and for that matter which monitors might already be held at any 
> given moment. By imposing a rank on each monitor, we can check to make sure 
> they are always grabbed in the order of their rank. Having this in place when 
> I was working on [JDK-8324868](https://bugs.openjdk.org/browse/JDK-8324868) 
> would have made it much easier to detect a deadlock that was occuring, and 
> the reason for it. That's what motivated me to do this work
> 
> There were 2 or 3 minor rank issues discovered as a result of these changes. 
> I also learned a lot about some of the more ugly details of the locking 
> support in the process.
> 
> Tested with the following on all supported platforms and with virtual threads:
> 
> com/sun/jdi
> vmTestbase/nsk/jdi
> vmTestbase/nsk/jdb
> vmTestbase/nsk/jdwp 
> 
> Still need to run tier2 and tier5.
> 
> Details of the changes follow in the first comment.

src/jdk.jdwp.agent/share/native/libjdwp/threadControl.c line 656:

> 654:     commonRef_lock();
> 655:     if (gdata->rankedMonitors) {
> 656:         dbgRawMonitor_lock();

What is this call for? I think dbgRawMonitor_lock/dbgRawMonitor_unlock should 
not be used outside of util.c (I'd remove them from util.h)

src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1204:

> 1202:     // Need to lock during initialization so verifyMonitorRank() can be 
> guaranteed that
> 1203:     // if the monitor field is set, then the monitor if fully 
> initialized. More specifically,
> 1204:     // that the rank field has been set.

rank for all monitors can be set in `util_initialize()`:

for (i = 0; i < NUM_DEBUG_RAW_MONITORS; i++) {
  dbg_monitors[i]->rank = i;
}

src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1231:

> 1229:     }
> 1230: 
> 1231:     dbg_monitor->monitor = NULL;

I think it would be better to protect this with dbgRawMonitor

src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1243:

> 1241:     error = JVMTI_FUNC_PTR(gdata->jvmti,GetThreadInfo)
> 1242:             (gdata->jvmti, thread, &info);
> 1243:     return info.name;

Need to delete JNI reference info.thread_group and info.jthreadGroup (if not 
NULL)

src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1262:

> 1260:     for (i = 0; i < NUM_DEBUG_RAW_MONITORS; i++) {
> 1261:         DebugRawMonitor* dbg_monitor = &dbg_monitors[i];
> 1262:         if (dbg_monitor == NULL) {

Should be `dbg_monitor->monitor`

src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1279:

> 1277:         return; // This assert is not reliable if the VM is exiting
> 1278:     }
> 1279:     jthread current_thread = threadControl_currentThread();

Looks like all callers of the `assertIsCurrentThread()` have reference to the 
current thread. Maybe pass it as argument?

src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1281:

> 1279:     jthread current_thread = threadControl_currentThread();
> 1280:     if (!isSameObject(env, thread, current_thread)) {
> 1281:         tty_message("ERROR: Threads not the same: %p %p\n", thread, 
> current_thread);

Not sure the addresses provide useful information. Maybe print thread names?

src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1289:

> 1287: 
> 1288: static void
> 1289: verifyMonitorRank(JNIEnv *env, DebugRawMonitorRank rank, jthread thread)

please add a comment that the function should be called under dbgRawMonitor 
lock (also for other functions which assume this)

src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1294:

> 1292:     // has a higher rank than the monitor we are about to enter.
> 1293:     DebugRawMonitorRank i;
> 1294:     for (i = 0; i < NUM_DEBUG_RAW_MONITORS; i++) {

No need to check monitors from the beginning.
Should be enough to check starting from `min(rank, 
FIRST_LEAF_DEBUG_RAW_MONITOR)`

src/jdk.jdwp.agent/share/native/libjdwp/util.c line 1470:

> 1468: 
> 1469:     // Assert that the current thread owns this monitor.
> 1470:     assertIsCurrentThread(getEnv(), dbg_monitor->ownerThread);

reading ownerThread (and other) fields without dbgRawMonitor is not MT-safe 
(it's not atomic)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1586999913
PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1588259933
PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1586968454
PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1586970362
PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1586973059
PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1586983415
PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1586984623
PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1588206582
PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1586976698
PR Review Comment: https://git.openjdk.org/jdk/pull/19044#discussion_r1588252771

Reply via email to