On Sun, 29 Sep 2024 22:22:30 GMT, David Holmes <dhol...@openjdk.org> wrote:
>> When investigating an intermittent NPE with an Oracle internal test on >> AArch64, I noticed that the NPE is actually a SIGSEGV in the code emitted by >> `MethodHandles::jump_to_lambda_form` when trying to load the >> `MemberName::method` field because the `MemberName` we retrieved from >> `LambdaForm::vmentry` is null: >> >> https://github.com/openjdk/jdk/blob/f0374a0bc181d0f2a8c0aa9aa032b07998ffaf60/src/hotspot/cpu/aarch64/methodHandles_aarch64.cpp#L141-L143 >> >> The JVM translates SIGSEGVs happening in method handle intrinsics to NPEs, >> assuming that they are implicit NPEs due to a null receiver: >> https://github.com/openjdk/jdk/blob/f0374a0bc181d0f2a8c0aa9aa032b07998ffaf60/src/hotspot/share/runtime/sharedRuntime.cpp#L979-L982 >> >> After digging around in the MethodHandle implementation that I'm not too >> familiar with, I found this suspicious code in `MethodHandle::updateForm`: >> https://github.com/openjdk/jdk/blob/36314a90c15e2ab2a9b32c2e471655c1b07d452c/src/java.base/share/classes/java/lang/invoke/MethodHandle.java#L1881-L1883 >> >> The `Unsafe.fullFence` was added by >> [JDK-8059877](https://bugs.openjdk.org/browse/JDK-8059877) and replaced a >> `// ISSUE: Should we have a memory fence here?` >> [comment](https://github.com/openjdk/jdk/commit/224c42ee4d4c3027d1f8f0d8b7ecf6646e9418c3#diff-5a4b2f817a4273eacf07f3ee24782c58c8ff474c6d790f72298e906837c5543aL1442). >> If the intention was to [prevent publishing a partially initialized >> object](https://wiki.sei.cmu.edu/confluence/display/java/TSM03-J.+Do+not+publish+partially+initialized+objects), >> I think it was added to the wrong place (paging @iwanowww). >> >> In the failing scenario, we concurrently trigger LambdaForm customization >> for a VarHandle invoker: >> >> at >> java.base/java.lang.invoke.MethodHandle.updateForm(MethodHandle.java:1883) >> at >> java.base/java.lang.invoke.MethodHandle.customize(MethodHandle.java:1856) >> at >> java.base/java.lang.invoke.MethodHandle.maybeCustomize(MethodHandle.java:1846) >> at java.base/java.lang.invoke.Invokers.maybeCustomize(Invokers.java:632) >> at >> java.base/java.lang.invoke.Invokers.checkCustomized(Invokers.java:626) >> >> The problem is that another thread can observe `newForm` which via the >> `MethodHandle::form` field before the store to `vmentry` in `prepare()` (see >> [here](https://github.com/openjdk/jdk/blob/36314a90c15e2ab2a9b32c2e471655c1b07d452c/src/java.base/share/classes/java/lang/invoke/LambdaForm.java#L821)) >> is visible and therefore observe null. We need a StoreStore barrier to >> prevent store... > > src/java.base/share/classes/java/lang/invoke/MethodHandle.java line 1882: > >> 1880: assert (newForm.customized == null || >> newForm.customized == this); >> 1881: newForm.prepare(); // as in MethodHandle.<init> >> 1882: UNSAFE.putReferenceRelease(this, FORM_OFFSET, >> newForm); // properly publish newForm > > A full-fence is a one-sided global synchronization action. A "release" store > is one side of a two-sided synchronization action: where is the > "load-acquire" that this pairs with? The `form` field is a final field; thus, all reads to that field is a load-acquire. This load-load barrier is critical to ensure observing the correct, non-null `vmentry` field value if the updated form is observed. Note that JIT compilation may constant-fold the preexisting form and ignore the updated form. This is still behaviorally identical, but usually the new form is more suitable for constant folding if JIT compilation can pick it up. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/21160#discussion_r1780194398