On Wed, 7 Feb 2024 07:02:11 GMT, Serguei Spitsyn <sspit...@openjdk.org> wrote:
>> The implementation of the JVM TI `GetObjectMonitorUsage` does not match the >> spec. >> The function returns the following structure: >> >> >> typedef struct { >> jthread owner; >> jint entry_count; >> jint waiter_count; >> jthread* waiters; >> jint notify_waiter_count; >> jthread* notify_waiters; >> } jvmtiMonitorUsage; >> >> >> The following four fields are defined this way: >> >> waiter_count [jint] The number of threads waiting to own this monitor >> waiters [jthread*] The waiter_count waiting threads >> notify_waiter_count [jint] The number of threads waiting to be notified by >> this monitor >> notify_waiters [jthread*] The notify_waiter_count threads waiting to be >> notified >> >> The `waiters` has to include all threads waiting to enter the monitor or to >> re-enter it in `Object.wait()`. >> The implementation also includes the threads waiting to be notified in >> `Object.wait()` which is wrong. >> The `notify_waiters` has to include all threads waiting to be notified in >> `Object.wait()`. >> The implementation also includes the threads waiting to re-enter the monitor >> in `Object.wait()` which is wrong. >> This update makes it right. >> >> The implementation of the JDWP command `ObjectReference.MonitorInfo (5)` is >> based on the JVM TI `GetObjectMonitorInfo()`. This update has a tweak to >> keep the existing behavior of this command. >> >> The follwoing JVMTI vmTestbase tests are fixed to adopt to the >> `GetObjectMonitorUsage()` correct behavior: >> >> jvmti/GetObjectMonitorUsage/objmonusage001 >> jvmti/GetObjectMonitorUsage/objmonusage003 >> >> >> The following JVMTI JCK tests have to be fixed to adopt to correct behavior: >> >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00101/gomu00101a.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102.html >> vm/jvmti/GetObjectMonitorUsage/gomu001/gomu00102/gomu00102a.html >> >> >> >> A JCK bug will be filed and the tests have to be added into the JCK problem >> list located in the closed repository. >> >> Also, please see and review the related CSR: >> [8324677](https://bugs.openjdk.org/browse/JDK-8324677): incorrect >> implementation of JVM TI GetObjectMonitorUsage >> >> The Release-Note is: >> [8325314](https://bugs.openjdk.org/browse/JDK-8325314): Release Note: >> incorrect implementation of JVM TI GetObjectMonitorUsage >> >> Testing: >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request with a new target base due to a > merge or a rebase. The incremental webrev excludes the unrelated changes > brought in by the merge/rebase. The pull request contains three additional > commits since the last revision: > > - Merge > - review: thread in notify waiter list can't be BLOCKED > - 8324677: Specification clarification needed for JVM TI > GetObjectMonitorUsage Okay... so we might have an incorrect implementation of JVM TI GetObjectMonitorUsage, but I'm not convinced that it is broken in the way that we think it is broken. Please read thru my unfortunately long comments on the complicated logic in `JvmtiEnvBase::get_object_monitor_usage()`. I _think_ things are broken in a different way than what we're talking about in this bug. In particular, the logic that populates the `waiters` array and reduces the `waiter_count` value overrides the value that we think we're fixing in the beginning of the function. Similarly, the logic that populates the `notify_waiters` array and reduces the `notify_waiter_count` value also overrides the value that we think we're fixing in the beginning of the function. src/hotspot/share/prims/jvmtiEnvBase.cpp line 1489: > 1487: nWait = mon->waiters(); // # of threads in Object.wait() > 1488: ret.waiter_count = nWant; > 1489: ret.notify_waiter_count = nWait; Please note that the comment on L1487 is accurate. The `waiters` count is incremented just before the thread that has called `wait()` drops the monitor and that increased count remains in effect until after the thread has reentered the monitor after being notified or the wait timeout has expired. The `contentions` count is not incremented in the re-entry path so a thread that is in `wait()` that has been notified or the wait timeout has expired is not counted as a contending thread. So if we really want the `waiter_count` field to include threads that have been notified or the wait timeout has expired, then we have to add some logic to carefully increment the `contentions` count. This old logic: ``` ret.waiter_count = nWant + nWait;``` over counts because it also includes threads in `wait()` that have not yet been notified and the wait timeout has not expired. However, including `nWait` is correct for the situation when all of the waiting threads have been notified or their wait timeouts have expired. This also means that `nWait` and the `ret.notify_waiter_count` value are over counting waiters because that value will include threads that (have been notified or the wait timeout has expired) AND have not reentered the monitor. I _think_ we're saying that is NOT what we want. However, I _think_ that the `nWait` value is fixed below when we're gathering up the list of waiters. src/hotspot/share/prims/jvmtiEnvBase.cpp line 1521: > 1519: // we have contending and/or waiting threads > 1520: if (nWant > 0) { > 1521: // we have contending threads This block includes this logic: // get_pending_threads returns only java thread so we do not need to // check for non java threads. GrowableArray<JavaThread*>* wantList = Threads::get_pending_threads(tlh.list(), nWant, (address)mon); if (wantList->length() < nWant) { // robustness: the pending list has gotten smaller nWant = wantList->length(); } `Threads::get_pending_threads()` only returns threads where the `current_pending_monitor` field is set for the specific monitor. That happens only on contended enter and does not happen on contended re-enter so this logic will already strip out any threads in `wait()` that have not been notified and have not had their wait timers expire. It will also strip out any waiters that have been notified or had their wait timeouts expire. This means even if we fix the reenter code to increment the contentions field, this logic will reduce that `nWant` value. Of course, the way around that is to also update the reenter code to properly set the `current_pending_monitor` field and then the reentering threads won't be filtered out... src/hotspot/share/prims/jvmtiEnvBase.cpp line 1560: > 1558: // Adjust count. nWant and nWait count values may be less than > original. > 1559: ret.waiter_count = nWant; > 1560: ret.notify_waiter_count = nWait; This old logic: ``` ret.waiter_count = nWant + nWait;``` over counts because it also includes threads in wait() that have not yet been notified and the wait timeout has not expired. However, including nWait is correct for the situation when all of the waiting threads have been notified or their wait timeouts have expired. If we have fixed the increment of contentions like the comment above mentions, then `nWant` will properly reflect the semantics that I think we're talking about. When we're gathering up the entries for `waiters` above, we traverse the list of waiters and we reduce the `nWait` when the list is shorter than what we expected earlier. See L1540 -> L1544. test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetObjectMonitorUsage/objmonusage001/objmonusage001.cpp line 95: > 93: > 94: JNIEXPORT void JNICALL > 95: Java_nsk_jvmti_GetObjectMonitorUsage_objmonusage001_check(JNIEnv *env, I would have expected a modification to an objmonusage001.java file to update the parameters to that test's `check()` function. test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetObjectMonitorUsage/objmonusage001/objmonusage001.cpp line 161: > 159: > 160: if (inf.notify_waiter_count != notifyWaiterCount) { > 161: printf("(%d) waiter_count expected: %d, actually: %d\n", nit typo: s/waiter_count/notify_waiter_count/ test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetObjectMonitorUsage/objmonusage003/objmonusage003.cpp line 150: > 148: > 149: if (inf.notify_waiter_count != notifyWaiterCount) { > 150: printf("(%d) waiter_count expected: %d, actually: %d\n", nit typo: s/waiter_count/notify_waiter_count/ ------------- Changes requested by dcubed (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17680#pullrequestreview-1868875636 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1482144591 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1482180688 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1482155914 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1482189539 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1482185062 PR Review Comment: https://git.openjdk.org/jdk/pull/17680#discussion_r1482185460