On Wed, 23 Oct 2024 22:59:19 GMT, Coleen Phillimore <cole...@openjdk.org> wrote:

>> This is the implementation of JEP 491: Synchronize Virtual Threads without 
>> Pinning. See [JEP 491](https://bugs.openjdk.org/browse/JDK-8337395) for 
>> further details.
>> 
>> In order to make the code review easier the changes have been split into the 
>> following initial 4 commits:
>> 
>> - Changes to allow unmounting a virtual thread that is currently holding 
>> monitors.
>> - Changes to allow unmounting a virtual thread blocked on synchronized 
>> trying to acquire the monitor.
>> - Changes to allow unmounting a virtual thread blocked in `Object.wait()` 
>> and its timed-wait variants.
>> - Changes to tests, JFR pinned event, and other changes in the JDK libraries.
>> 
>> The changes fix pinning issues for all 4 ports that currently implement 
>> continuations: x64, aarch64, riscv and ppc. Note: ppc changes were added 
>> recently and stand in its own commit after the initial ones.
>> 
>> The changes fix pinning issues when using `LM_LIGHTWEIGHT`, i.e. the default 
>> locking mode, (and `LM_MONITOR` which comes for free), but not when using 
>> `LM_LEGACY` mode. Note that the `LockingMode` flag has already been 
>> deprecated ([JDK-8334299](https://bugs.openjdk.org/browse/JDK-8334299)), 
>> with the intention to remove `LM_LEGACY` code in future releases.
>> 
>> 
>> ## Summary of changes
>> 
>> ### Unmount virtual thread while holding monitors
>> 
>> As stated in the JEP, currently when a virtual thread enters a synchronized 
>> method or block, the JVM records the virtual thread's carrier platform 
>> thread as holding the monitor, not the virtual thread itself. This prevents 
>> the virtual thread from being unmounted from its carrier, as ownership 
>> information would otherwise go wrong. In order to fix this limitation we 
>> will do two things:
>> 
>> - We copy the oops stored in the LockStack of the carrier to the stackChunk 
>> when freezing (and clear the LockStack). We copy the oops back to the 
>> LockStack of the next carrier when thawing for the first time (and clear 
>> them from the stackChunk). Note that we currently assume carriers don't hold 
>> monitors while mounting virtual threads.
>> 
>> - For inflated monitors we now record the `java.lang.Thread.tid` of the 
>> owner in the ObjectMonitor's `_owner` field instead of a JavaThread*. This 
>> allows us to tie the owner of the monitor to a `java.lang.Thread` instance, 
>> rather than to a JavaThread which is only created per platform thread. The 
>> tid is already a 64 bit field so we can ignore issues of the counter 
>> wrapping around.
>> 
>> #### General notes about this part:
>> 
>> - Since virtual th...
>
> src/hotspot/cpu/aarch64/continuationFreezeThaw_aarch64.inline.hpp line 135:
> 
>> 133:   assert(*f.addr_at(frame::interpreter_frame_last_sp_offset) == 0, 
>> "should be null for top frame");
>> 134:   intptr_t* lspp = f.addr_at(frame::interpreter_frame_last_sp_offset);
>> 135:   *lspp = f.unextended_sp() - f.fp();
> 
> Can you write a comment what this is doing briefly and why?

Added comment.

> src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp line 1550:
> 
>> 1548: #endif /* ASSERT */
>> 1549: 
>> 1550:   push_cont_fastpath();
> 
> One of the callers of this gives a clue what it does.
> 
>   __ push_cont_fastpath(); // Set JavaThread::_cont_fastpath to the sp of the 
> oldest interpreted frame we know about
> 
> Why do you do this here?  Oh please more comments...

_cont_fastpath is what we check in freeze_internal to decide if we can take the 
fast path. Since we are calling from the interpreter we have to take the slow 
path. Added a comment.

> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5354:
> 
>> 5352:   str(rscratch2, dst);
>> 5353:   Label ok;
>> 5354:   tbz(rscratch2, 63, ok);
> 
> 63?  Does this really need to have underflow checking?  That would alleviate 
> the register use concerns if it didn't.  And it's only for legacy locking 
> which should be stable until it's removed.

I can remove the check. I don't think it hurts either though. Also we can 
actually just use rscratch1 in the ASSERT case.

> src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 2032:
> 
>> 2030:     // Force freeze slow path in case we try to preempt. We will pin 
>> the
>> 2031:     // vthread to the carrier (see 
>> FreezeBase::recurse_freeze_native_frame()).
>> 2032:     __ push_cont_fastpath();
> 
> We need to do this because we might freeze, so JavaThread::_cont_fastpath 
> should be set in case we do?

Right. We want to take the slow path to find the compiled native wrapper frame 
and fail to freeze. Otherwise the fast path won't find it since we don't walk 
the stack.

> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 231:
> 
>> 229: 
>> 230: void MacroAssembler::inc_held_monitor_count(Register tmp) {
>> 231:   Address dst = Address(xthread, 
>> JavaThread::held_monitor_count_offset());
> 
> Address dst(xthread, JavaThread::held_monitor_count_offset());

Done.

> src/hotspot/share/interpreter/oopMapCache.cpp line 268:
> 
>> 266:   }
>> 267: 
>> 268:   int num_oops() { return _num_oops; }
> 
> I can't find what uses this from OopMapCacheEntry.

It's needed for verification in VerifyStackChunkFrameClosure. It's called in 
OopMapCacheEntry::fill_for_native(), and we get there from here: 
https://github.com/openjdk/jdk/blob/66d5385f8a1c84e73cdbf385239089a7a9932a9e/src/hotspot/cpu/x86/stackChunkFrameStream_x86.inline.hpp#L114

> src/hotspot/share/runtime/continuation.cpp line 89:
> 
>> 87:       // we would incorrectly throw it during the unmount logic in the 
>> carrier.
>> 88:       if (_target->has_async_exception_condition()) {
>> 89:         _failed = false;
> 
> This says "Don't" but then failed is false which doesn't make sense.  Should 
> it be true?

Yes, good catch.

> src/hotspot/share/runtime/continuationFreezeThaw.cpp line 1275:
> 
>> 1273: 
>> 1274:   if (caller.is_interpreted_frame()) {
>> 1275:     _total_align_size += frame::align_wiggle;
> 
> Please put a comment here about frame align-wiggle.

I removed this case since it can never happen. The caller has to be compiled, 
and we assert that at the beginning. This was a leftover from the forceful 
preemption at a safepoint work. I removed the similar code in 
recurse_thaw_stub_frame. I added a comment for the compiled and native cases 
though.

> src/hotspot/share/runtime/continuationFreezeThaw.cpp line 1278:
> 
>> 1276:   }
>> 1277: 
>> 1278:   patch(f, hf, caller, false /*is_bottom_frame*/);
> 
> I also forgot what patch does.  Can you add a comment here too?

I added a comment where it is defined since it is used in several places.

> src/hotspot/share/runtime/continuationFreezeThaw.cpp line 1550:
> 
>> 1548:   assert(!cont.is_empty(), "");
>> 1549:   // This is done for the sake of the enterSpecial frame
>> 1550:   StackWatermarkSet::after_unwind(thread);
> 
> Is there a new place for this StackWatermark code?

I removed it. We have already processed the enterSpecial frame as part of 
flush_stack_processing(), in fact we processed up to the caller of 
`Continuation.run()`.

> src/hotspot/share/runtime/continuationFreezeThaw.cpp line 2235:
> 
>> 2233:       assert(!mon_acquired || mon->has_owner(_thread), "invariant");
>> 2234:       if (!mon_acquired) {
>> 2235:         // Failed to aquire monitor. Return to enterSpecial to unmount 
>> again.
> 
> typo: acquire

Fixed.

> src/hotspot/share/runtime/continuationFreezeThaw.cpp line 2492:
> 
>> 2490: void ThawBase::throw_interrupted_exception(JavaThread* current, frame& 
>> top) {
>> 2491:   ContinuationWrapper::SafepointOp so(current, _cont);
>> 2492:   // Since we might safepoint set the anchor so that the stack can we 
>> walked.
> 
> typo: can be walked

Fixed.

> src/hotspot/share/runtime/javaThread.hpp line 334:
> 
>> 332:   bool                  _pending_jvmti_unmount_event;    // When 
>> preempting we post unmount event at unmount end rather than start
>> 333:   bool                  _on_monitor_waited_event;        // Avoid 
>> callee arg processing for enterSpecial when posting waited event
>> 334:   ObjectMonitor*        _contended_entered_monitor;      // Monitor por 
>> pending monitor_contended_entered callback
> 
> typo: Monitor **for** pending_contended_entered callback

Fixed.

> src/hotspot/share/runtime/objectMonitor.cpp line 876:
> 
>> 874:   // and in doing so avoid some transitions ...
>> 875: 
>> 876:   // For virtual threads that are pinned do a timed-park instead, to
> 
> I had trouble parsing this first sentence.  I think it needs a comma after 
> pinned and remove the comma after instead.

Fixed.

> src/hotspot/share/runtime/objectMonitor.cpp line 2305:
> 
>> 2303: }
>> 2304: 
>> 2305: void ObjectMonitor::Initialize2() {
> 
> Can you put a comment why there's a second initialize function?  Presumably 
> after some state is set.

Added comment.

> src/hotspot/share/runtime/objectMonitor.hpp line 43:
> 
>> 41: // ParkEvent instead.  Beware, however, that the JVMTI code
>> 42: // knows about ObjectWaiters, so we'll have to reconcile that code.
>> 43: // See next_waiter(), first_waiter(), etc.
> 
> Also a nice cleanup.  Did you reconcile the JVMTI code?

We didn't remove the ObjectWaiter. As for the presence of virtual threads in 
the list, we skip them in JVMTI get_object_monitor_usage. We already degraded 
virtual thread support for GetObjectMonitorUsage.

> src/hotspot/share/runtime/objectMonitor.hpp line 71:
> 
>> 69:   bool is_wait()           { return _is_wait; }
>> 70:   bool notified()          { return _notified; }
>> 71:   bool at_reenter()        { return _at_reenter; }
> 
> should these be const member functions?

Yes, changed to const.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1816658344
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1816660065
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1813516395
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1816660542
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1813519648
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1819462987
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1816660817
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1816661388
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1816661733
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1816662247
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1823583906
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1823583954
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1823583822
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1816662554
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1816663065
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1819463651
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1819463958

Reply via email to