On Fri, 9 Jan 2026 12:05:37 GMT, Anton Artemov <[email protected]> wrote:
>> Hi, please consider the following changes: >> >> If suspension is allowed when a thread is re-entering an object monitor >> (OM), then a following liveness issues can happen in the >> `ObjectMonitor::wait()` method. >> >> The waiting thread is made to be a successor and is unparked. Upon a >> suspension request, the thread will suspend itself whilst clearing the >> successor. The OM will be left unlocked (not grabbed by any thread), while >> the other threads are parked until a thread grabs the OM and the exits it. >> The suspended thread is on the entry-list and can be selected as a successor >> again. None of other threads can be woken up to grab the OM until the >> suspended thread has been resumed and successfully releases the OM. >> >> This can happen in three places where the successor could be suspended: >> >> 1: >> https://github.com/openjdk/jdk/blob/6322aaba63b235cb6c73d23a932210af318404ec/src/hotspot/share/runtime/objectMonitor.cpp#L1897 >> >> 2: >> https://github.com/openjdk/jdk/blob/6322aaba63b235cb6c73d23a932210af318404ec/src/hotspot/share/runtime/objectMonitor.cpp#L1149 >> >> 3: >> https://github.com/openjdk/jdk/blob/6322aaba63b235cb6c73d23a932210af318404ec/src/hotspot/share/runtime/objectMonitor.cpp#L1951 >> >> The issues are addressed by not allowing suspension in case 1, and by >> handling the suspension request at a later stage, after the thread has >> grabbed the OM in `reenter_internal()` in case 2. In case of a suspension >> request, the thread exits the OM and enters it again once resumed. >> >> Case 3 is handled by not transferring a thread to the `entry_list` in >> `notify_internal()` in case the corresponding JVMTI event is allowed. >> Instead, a tread is unparked and let run. Since it is not on the >> `entry_list`, it will not be chosen as a successor and it is no harm to >> suspend it if needed when posting the event. >> >> Possible issue of posting a `waited` event while still be suspended is >> addressed by adding a suspension check just before the posting of event. >> >> Tests are added. >> >> Tested in tiers 1 - 7. > > Anton Artemov has updated the pull request with a new target base due to a > merge or a rebase. The pull request now contains 44 commits: > > - 8366659: Fixed year in the copyright. > - Merge remote-tracking branch 'origin/master' into > JDK-8366659-OM-wait-suspend-deadlock > - 8366659: Removed ClearSuccOnSuspend > - 8366659: Fixed liveness problem. > - Merge remote-tracking branch 'origin/master' into > JDK-8366659-OM-wait-suspend-deadlock > - 8366659: Fixed build problem. > - 8366659: Fixed build issue. > - 8366659: Changed the way how notify_internal works if JVMTI monitor waited > event allowed. > - Merge remote-tracking branch 'origin/master' into > JDK-8366659-OM-wait-suspend-deadlock > - 8366659: Fixed semi-broken tests > - ... and 34 more: https://git.openjdk.org/jdk/compare/a01283a5...21b83214 I'm struggling to map code changes to the description of the changes. I think a lot more comments are needed to describe things more clearly. The three testcases need a clearer description - and '1' '2' '3' are not great names. Given we are in the SuspendWithObjectMonitorWait directory we do not need that prefix on all the file names - we can use shorter but more descriptive names for what is actually being tested. Thanks src/hotspot/share/runtime/objectMonitor.cpp line 1907: > 1905: // That is, we fail toward safety. > 1906: > 1907: was_notified = true; This isn't quite right. If we would reach line 1882 then we were not notified, so we cannot unconditionally set this to true here. I think you need to initialize to true at line 1868 and then set to false at line 1882 (deleting the current comment). src/hotspot/share/runtime/objectMonitor.cpp line 1917: > 1915: } > 1916: > 1917: // The thread is now either on off-list (TS_RUN), Suggestion: // The thread is now either off-list (TS_RUN), src/hotspot/share/runtime/objectMonitor.cpp line 1932: > 1930: // (Don't cache naked oops over safepoints, of course). > 1931: > 1932: // post monitor waited event. Note that this is past-tense, we are > done waiting. Suggestion: // Post monitor waited event. Note that this is past-tense, we are done waiting. You need to expand the comment to explain the significance of checking for TS_ENTER. src/hotspot/share/runtime/objectMonitor.cpp line 1967: > 1965: // We can go to a safepoint at the end of this block. If we > 1966: // do a thread dump during that safepoint, then this thread > will show > 1967: // as having "-locked" the monitor, but the OS and > java.lang.Thread To be clear, if it is a plain-old safepoint, we will show the monitor as locked because it is. But if it happens we were suspended, then we will release the monitor and so we wouldn't report locked. src/hotspot/share/runtime/objectMonitor.cpp line 1973: > 1971: // and set the OM as pending. > 1972: } > 1973: // Done waiting, post the corresponding event What event? I expected to see `JvmtiExport::post_monitor_waited` here. ?? src/hotspot/share/runtime/objectMonitor.cpp line 2053: > 2051: // Adding to _entry_list uses Atomic::cmpxchg() which already > provides > 2052: // a fence that prevents reordering of the stores. > 2053: if (!JvmtiExport::should_post_monitor_waited()) { You need to explain why this is now conditional. src/hotspot/share/runtime/objectMonitor.cpp line 2063: > 2061: if (!JvmtiExport::should_post_monitor_waited()) { > 2062: add_to_entry_list(current, iterator); > 2063: } else { You need to explain what each of these paths is doing. At this point I'm lost. src/hotspot/share/runtime/objectMonitor.cpp line 2069: > 2067: > 2068: oop vthread = nullptr; > 2069: ParkEvent* Trigger; I assume this is copied from elsewhere but we should use proper style rules in the new code i.e. local variables don't start with a capital. Thanks. Also "trigger" doesn't really convet anythng meaningful to me - `evt` for event seems fine. src/hotspot/share/runtime/objectMonitor.cpp line 2277: > 2275: node->_at_reenter = true; > 2276: > 2277: if (state == ObjectWaiter::TS_RUN) { I'm finding the changes to `was_notified` versus use of explicit state checks very confusing. I keep asking myself which state means I was notified and which does not. test/hotspot/jtreg/serviceability/jvmti/SuspendWithObjectMonitorWait/SuspendWithObjectMonitorWait1.java line 71: > 69: public int doWork1(int timeMax, PrintStream out) { > 70: SuspendWithObjectMonitorWaitWorker waiter; // waiter thread > 71: SuspendWithObjectMonitorWaitWorker resumer; // resumer thread Suggestion: SuspendWithObjectMonitorWaitWorker waiter; // waiter thread SuspendWithObjectMonitorWaitWorker resumer; // resumer thread ------------- Changes requested by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/27040#pullrequestreview-3675924710 PR Review Comment: https://git.openjdk.org/jdk/pull/27040#discussion_r2702902497 PR Review Comment: https://git.openjdk.org/jdk/pull/27040#discussion_r2702905495 PR Review Comment: https://git.openjdk.org/jdk/pull/27040#discussion_r2702909290 PR Review Comment: https://git.openjdk.org/jdk/pull/27040#discussion_r2702947656 PR Review Comment: https://git.openjdk.org/jdk/pull/27040#discussion_r2702958092 PR Review Comment: https://git.openjdk.org/jdk/pull/27040#discussion_r2702960742 PR Review Comment: https://git.openjdk.org/jdk/pull/27040#discussion_r2702965432 PR Review Comment: https://git.openjdk.org/jdk/pull/27040#discussion_r2702967605 PR Review Comment: https://git.openjdk.org/jdk/pull/27040#discussion_r2702974131 PR Review Comment: https://git.openjdk.org/jdk/pull/27040#discussion_r2702975301
