On Thu, 17 Oct 2024 14:28:30 GMT, Patricio Chilano Mateo 
<pchilanom...@openjdk.org> wrote:

> This is the implementation of JEP 491: Synchronize Virtual Threads without 
> Pinning. See [JEP 491](https://bugs.openjdk.org/browse/JDK-8337395) for 
> further details.
> 
> In order to make the code review easier the changes have been split into the 
> following initial 4 commits:
> 
> - Changes to allow unmounting a virtual thread that is currently holding 
> monitors.
> - Changes to allow unmounting a virtual thread blocked on synchronized trying 
> to acquire the monitor.
> - Changes to allow unmounting a virtual thread blocked in `Object.wait()` and 
> its timed-wait variants.
> - Changes to tests, JFR pinned event, and other changes in the JDK libraries.
> 
> The changes fix pinning issues for all 4 ports that currently implement 
> continuations: x64, aarch64, riscv and ppc. Note: ppc changes were added 
> recently and stand in its own commit after the initial ones.
> 
> The changes fix pinning issues when using `LM_LIGHTWEIGHT`, i.e. the default 
> locking mode, (and `LM_MONITOR` which comes for free), but not when using 
> `LM_LEGACY` mode. Note that the `LockingMode` flag has already been 
> deprecated ([JDK-8334299](https://bugs.openjdk.org/browse/JDK-8334299)), with 
> the intention to remove `LM_LEGACY` code in future releases.
> 
> 
> ## Summary of changes
> 
> ### Unmount virtual thread while holding monitors
> 
> As stated in the JEP, currently when a virtual thread enters a synchronized 
> method or block, the JVM records the virtual thread's carrier platform thread 
> as holding the monitor, not the virtual thread itself. This prevents the 
> virtual thread from being unmounted from its carrier, as ownership 
> information would otherwise go wrong. In order to fix this limitation we will 
> do two things:
> 
> - We copy the oops stored in the LockStack of the carrier to the stackChunk 
> when freezing (and clear the LockStack). We copy the oops back to the 
> LockStack of the next carrier when thawing for the first time (and clear them 
> from the stackChunk). Note that we currently assume carriers don't hold 
> monitors while mounting virtual threads.
> 
> - For inflated monitors we now record the `java.lang.Thread.tid` of the owner 
> in the ObjectMonitor's `_owner` field instead of a JavaThread*. This allows 
> us to tie the owner of the monitor to a `java.lang.Thread` instance, rather 
> than to a JavaThread which is only created per platform thread. The tid is 
> already a 64 bit field so we can ignore issues of the counter wrapping around.
> 
> #### General notes about this part:
> 
> - Since virtual threads don't need to worry about holding monitors anymo...

I've done a first pass over the first commit and have some comments and 
questions.

Round 2.  There are a lot of very helpful comments in the new code to explain 
what it's doing but I have some requests for some more.  And some questions.

Some more comments and questions on the latest commit, mostly minor.

I've traced through the runtime code (minus calculations for continuations) and 
found some typos on the way.  Excellent piece of work.

> Then I looked at typing up the thread / lock ids as an enum class 
> https://github.com/openjdk/jdk/commit/34221f4a50a492cad4785cfcbb4bef8fa51d6f23

Both of these suggested changes should be discussed as different RFEs.  I don't 
really like this ThreadID change because it seems to introduce casting 
everywhere.

Noticed while downloading this that some copyrights need updating.

src/hotspot/cpu/aarch64/continuationFreezeThaw_aarch64.inline.hpp line 135:

> 133:   assert(*f.addr_at(frame::interpreter_frame_last_sp_offset) == 0, 
> "should be null for top frame");
> 134:   intptr_t* lspp = f.addr_at(frame::interpreter_frame_last_sp_offset);
> 135:   *lspp = f.unextended_sp() - f.fp();

Can you write a comment what this is doing briefly and why?

src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp line 1550:

> 1548: #endif /* ASSERT */
> 1549: 
> 1550:   push_cont_fastpath();

One of the callers of this gives a clue what it does.

  __ push_cont_fastpath(); // Set JavaThread::_cont_fastpath to the sp of the 
oldest interpreted frame we know about

Why do you do this here?  Oh please more comments...

src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5354:

> 5352:   str(rscratch2, dst);
> 5353:   Label ok;
> 5354:   tbz(rscratch2, 63, ok);

63?  Does this really need to have underflow checking?  That would alleviate 
the register use concerns if it didn't.  And it's only for legacy locking which 
should be stable until it's removed.

src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 2032:

> 2030:     // Force freeze slow path in case we try to preempt. We will pin the
> 2031:     // vthread to the carrier (see 
> FreezeBase::recurse_freeze_native_frame()).
> 2032:     __ push_cont_fastpath();

We need to do this because we might freeze, so JavaThread::_cont_fastpath 
should be set in case we do?

src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 2629:

> 2627:   addi(temp, displaced_header, in_bytes(ObjectMonitor::owner_offset()) 
> - markWord::monitor_value);
> 2628:   Register thread_id = displaced_header;
> 2629:   ld(thread_id, in_bytes(JavaThread::lock_id_offset()), R16_thread);

Maybe to make things really clear, you could call this thread_lock_id ?  Seems 
to be used consistently as thread_id in much of the platform code.

src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 231:

> 229: 
> 230: void MacroAssembler::inc_held_monitor_count(Register tmp) {
> 231:   Address dst = Address(xthread, 
> JavaThread::held_monitor_count_offset());

Address dst(xthread, JavaThread::held_monitor_count_offset());

src/hotspot/share/interpreter/oopMapCache.cpp line 268:

> 266:   }
> 267: 
> 268:   int num_oops() { return _num_oops; }

I can't find what uses this from OopMapCacheEntry.

src/hotspot/share/oops/stackChunkOop.inline.hpp line 189:

> 187: inline ObjectMonitor* stackChunkOopDesc::current_pending_monitor() const 
> {
> 188:   ObjectWaiter* waiter = object_waiter();
> 189:   if (waiter != nullptr && (waiter->is_monitorenter() || 
> (waiter->is_wait() && (waiter->at_reenter() || waiter->notified())))) {

Can we hide this conditional under ObjectWaiter::pending_monitor() { all this 
stuff with a comment; }

Not sure what this is excluding.

src/hotspot/share/runtime/continuation.cpp line 89:

> 87:       // we would incorrectly throw it during the unmount logic in the 
> carrier.
> 88:       if (_target->has_async_exception_condition()) {
> 89:         _failed = false;

This says "Don't" but then failed is false which doesn't make sense.  Should it 
be true?

src/hotspot/share/runtime/continuationFreezeThaw.cpp line 1275:

> 1273: 
> 1274:   if (caller.is_interpreted_frame()) {
> 1275:     _total_align_size += frame::align_wiggle;

Please put a comment here about frame align-wiggle.

src/hotspot/share/runtime/continuationFreezeThaw.cpp line 1278:

> 1276:   }
> 1277: 
> 1278:   patch(f, hf, caller, false /*is_bottom_frame*/);

I also forgot what patch does.  Can you add a comment here too?

src/hotspot/share/runtime/continuationFreezeThaw.cpp line 1550:

> 1548:   assert(!cont.is_empty(), "");
> 1549:   // This is done for the sake of the enterSpecial frame
> 1550:   StackWatermarkSet::after_unwind(thread);

Is there a new place for this StackWatermark code?

src/hotspot/share/runtime/continuationFreezeThaw.cpp line 1657:

> 1655: }
> 1656: 
> 1657: template<typename ConfigT, bool preempt>

This function is kind of big, do we really want it duplicated to pass preempt 
as a template parameter?

src/hotspot/share/runtime/continuationFreezeThaw.cpp line 2235:

> 2233:       assert(!mon_acquired || mon->has_owner(_thread), "invariant");
> 2234:       if (!mon_acquired) {
> 2235:         // Failed to aquire monitor. Return to enterSpecial to unmount 
> again.

typo: acquire

src/hotspot/share/runtime/continuationFreezeThaw.cpp line 2492:

> 2490: void ThawBase::throw_interrupted_exception(JavaThread* current, frame& 
> top) {
> 2491:   ContinuationWrapper::SafepointOp so(current, _cont);
> 2492:   // Since we might safepoint set the anchor so that the stack can we 
> walked.

typo: can be walked

src/hotspot/share/runtime/javaThread.cpp line 2002:

> 2000: #ifdef SUPPORT_MONITOR_COUNT
> 2001: 
> 2002: #ifdef LOOM_MONITOR_SUPPORT

If LOOM_MONITOR_SUPPORT is not true, this would skip this block and assert for 
LIGHTWEIGHT locking. Do we need this #ifdef ?

src/hotspot/share/runtime/javaThread.hpp line 334:

> 332:   bool                  _pending_jvmti_unmount_event;    // When 
> preempting we post unmount event at unmount end rather than start
> 333:   bool                  _on_monitor_waited_event;        // Avoid callee 
> arg processing for enterSpecial when posting waited event
> 334:   ObjectMonitor*        _contended_entered_monitor;      // Monitor por 
> pending monitor_contended_entered callback

typo: Monitor **for** pending_contended_entered callback

src/hotspot/share/runtime/objectMonitor.cpp line 416:

> 414:     set_owner_from_BasicLock(cur, current);  // Convert from BasicLock* 
> to Thread*.
> 415:     return true;
> 416:   }

Not needed? Oh I see, BasicLock is now in stack_locker.

src/hotspot/share/runtime/objectMonitor.cpp line 876:

> 874:   // and in doing so avoid some transitions ...
> 875: 
> 876:   // For virtual threads that are pinned do a timed-park instead, to

I had trouble parsing this first sentence.  I think it needs a comma after 
pinned and remove the comma after instead.

src/hotspot/share/runtime/objectMonitor.cpp line 1014:

> 1012:   assert_mark_word_consistency();
> 1013:   UnlinkAfterAcquire(current, currentNode);
> 1014:   if (is_succesor(current)) clear_succesor();

successor has two 's'.

src/hotspot/share/runtime/objectMonitor.cpp line 1158:

> 1156:     if (LockingMode != LM_LIGHTWEIGHT && 
> current->is_lock_owned((address)cur)) {
> 1157:       assert(_recursions == 0, "invariant");
> 1158:       set_owner_from_BasicLock(cur, current);  // Convert from 
> BasicLock* to Thread*.

This is nice you don't have to do this anymore.

src/hotspot/share/runtime/objectMonitor.cpp line 2305:

> 2303: }
> 2304: 
> 2305: void ObjectMonitor::Initialize2() {

Can you put a comment why there's a second initialize function?  Presumably 
after some state is set.

src/hotspot/share/runtime/objectMonitor.hpp line 43:

> 41: // ParkEvent instead.  Beware, however, that the JVMTI code
> 42: // knows about ObjectWaiters, so we'll have to reconcile that code.
> 43: // See next_waiter(), first_waiter(), etc.

Also a nice cleanup.  Did you reconcile the JVMTI code?

src/hotspot/share/runtime/objectMonitor.hpp line 71:

> 69:   bool is_wait()           { return _is_wait; }
> 70:   bool notified()          { return _notified; }
> 71:   bool at_reenter()        { return _at_reenter; }

should these be const member functions?

-------------

PR Review: https://git.openjdk.org/jdk/pull/21565#pullrequestreview-2386614214
PR Review: https://git.openjdk.org/jdk/pull/21565#pullrequestreview-2390813935
PR Review: https://git.openjdk.org/jdk/pull/21565#pullrequestreview-2396572570
Marked as reviewed by coleenp (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/21565#pullrequestreview-2405734604
PR Comment: https://git.openjdk.org/jdk/pull/21565#issuecomment-2430528701
PR Comment: https://git.openjdk.org/jdk/pull/21565#issuecomment-2442058307
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1813899129
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1814081166
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1811590155
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1814084085
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1811591482
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1811595282
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1817407075
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1823088425
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1814905064
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1815015410
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1815016232
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1815245735
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1815036910
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1823233359
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1823252062
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1811611376
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1823091373
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1811613400
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1815445109
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1811614453
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1817415918
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1815479877
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1817419797
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1817420178

Reply via email to