On Thu, 15 Aug 2024 14:32:40 GMT, Alan Bateman <al...@openjdk.org> wrote:

>> Would it be possible to create a boolean in the EventWriter that indicates 
>> if it is associated with a carrier thread or a normal thread (which can 
>> never be virtual) and then have two methods.
>> 
>>     long l = this.carrierThread ? StringPool.addPinnedString(s) : StringPool.addString(s);
>
> Thread.currentThread() has an intrinsic, and isVirtual is just a type check. 
> ContinuationSupport.isSupported reads a static final so will disappear once 
> compiled. The pattern we are using in other areas is for the pin to return a 
> boolean (like David suggested).

I looked into this in more detail. The current suggestion:

mov    r10,QWORD PTR [r15+0x388]  ; _vthread OopHandle
mov    r10,QWORD PTR [r10]        ; dereference OopHandle <<-- Thread.currentThread() intrinsic gives 2 instructions
mov    r11d,DWORD PTR [r10+0x8]   ; InstanceKlass to r11 <-- isVirtual()
mov    r10d,r11d                  ; InstanceKlass to r10
mov    r8,QWORD PTR [r10+0x40]    ; Load slot in InstanceKlass primary supers array to r8
movabs r10,0x2d0481a8             ; InstanceKlass for {metadata('java/lang/BaseVirtualThread')} to r10
cmp    r8,r10                     ; compare if superklass is java/lang/BaseVirtualThread
jne    0x0000018571e0baf9         ; 6 instructions for isVirtual() type check, 8 instructions in total

This gives a prologue of eight instructions.
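
For reference, the Java-level shape behind that prologue is roughly the "pin returns a boolean" pattern mentioned above. The following is only a minimal sketch: SketchPinning, pinIfVirtual() and unpin() are hypothetical names, the counter stands in for the real pin operation, and the `virtual` parameter stands in for the Thread.currentThread().isVirtual() check.

```java
import java.util.function.LongSupplier;

// Hypothetical stand-in; this is not the JDK's Continuation or JFR code.
final class SketchPinning {
    // Simulated pin depth; the real mechanism lives inside the VM.
    static int pinDepth = 0;

    // Pins only when the caller is virtual and reports whether it did so.
    // 'virtual' stands in for Thread.currentThread().isVirtual().
    static boolean pinIfVirtual(boolean virtual) {
        if (virtual) {
            pinDepth++;
            return true;
        }
        return false;
    }

    static void unpin() {
        pinDepth--;
    }

    // Runs 'body' pinned if needed, and unpins only if a pin was taken.
    static long runPinned(boolean virtual, LongSupplier body) {
        boolean pinned = pinIfVirtual(virtual);
        try {
            return body.getAsLong();
        } finally {
            if (pinned) {
                unpin();
            }
        }
    }
}
```

The returned boolean lets the caller skip the unpin on the platform-thread path without repeating the thread-type check.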

For JFR, we already have much of this information resolved when loading up the 
EventWriter instance using the existing intrinsic getEventWriter(). Hence, we 
could extend that to mark the event writer with a field to say if pinning 
should be performed. This results in only a two instruction prologue:

test   r8d,r8d                         ; pinVirtualThread? 
je     0x0000012580a0f6c9    ; 2 instructions for test

This is a 4x reduction in prologue instructions (eight down to two), although 
the net gain is slightly smaller because of an additional store instruction 
when loading the event writer.
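
To illustrate the idea, here is a minimal sketch of a writer that carries the pinning decision as a field. SketchEventWriter and SketchStringPool are hypothetical stand-ins, not the real JFR classes, and the assumption is that the flag is set once, when the writer is handed out for the current thread.

```java
// Hypothetical stand-ins; these are not the jdk.jfr.internal classes.
final class SketchStringPool {
    static int pinCalls = 0;          // counts simulated pin/unpin pairs
    private static long nextId = 1;   // not thread-safe; illustration only

    // Fast path: no pinning needed.
    static long addString(String s) {
        return nextId++;
    }

    // Pinned path: the counter stands in for Continuation.pin()/unpin().
    static long addPinnedString(String s) {
        pinCalls++;                   // simulated pin
        try {
            return addString(s);
        } finally {
            // the simulated unpin would go here
        }
    }
}

final class SketchEventWriter {
    // Assumed to be resolved when the writer is loaded for the thread,
    // so the hot path only has to test a single field.
    private final boolean pinVirtualThread;

    SketchEventWriter(boolean pinVirtualThread) {
        this.pinVirtualThread = pinVirtualThread;
    }

    long putString(String s) {
        return pinVirtualThread
                ? SketchStringPool.addPinnedString(s)
                : SketchStringPool.addString(s);
    }
}
```

With this shape the per-call check is a load and a test of one boolean field, which corresponds to the two-instruction prologue shown above.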

Further, I looked into the Continuation.pin() and Continuation.unpin() methods. 
They are currently not intrinsics, but lend themselves well to 
intrinsification. I have created such intrinsics, and the results are quite 
good.

Continuation.pin() or Continuation.unpin() without intrinsics = 112 instructions each
Continuation.pin() or Continuation.unpin() with intrinsics = 8 instructions each

That is 14 times fewer instructions (112 down to 8) for virtual threads.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20588#discussion_r1725145256
