There is a subtle race in `UpcallLinker::on_exit` between copying of the old 
frame anchor back into place, and the GC. Since this copy is not atomic, it may 
briefly appear as if a thread has no last Java frame, while still in the 
`_thread_in_native` state, which leads to the GC skipping processing of any 
active Java frames.

This code was originally adapted from `JavaCallWrapper::!JavaCallWrapper` - the 
JNI mechanism for upcalls - but in that case the frame anchor copy happens in 
the `_thread_in_vm` state, which means the GC will wait for the thread to get 
to a safepoint.

The solution proposed here is to do the frame anchor copy in the java thread 
state, before transitioning  back to the native state. The java thread state, 
like the vm thread state, is also 'safe' i.e. the GC will wait for the thread 
to get to a safepoint, so we can safely do our non-atomic copy of the frame 
anchor.

Additionally, this PR resolves a similar issue in `on_entry`, by moving the 
clearing of the pending exception (in case native code use a JNI API and didn't 
handle the exception afterwards). We now also skip checking for async 
exceptions when transitioning from native to java, so we don't immediately 
clear them. Any async exceptions will be picked up at the next safepoint 
instead.

Special thanks to @stefank and @fisk for finding the root cause, and @jaikiran 
for testing and debugging.

Testing: tier 1-4, 20k runs of the failing test on linux-aarch64.

-------------

Commit messages:
 - Merge branch 'master' into SafeFrameAnchor
 - Merge branch 'master' into SafeFrameAnchor
 - Don't touch frame anchor or current exception oop in native state

Changes: https://git.openjdk.org/jdk/pull/21742/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=21742&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8331735
  Stats: 15 lines in 1 file changed: 5 ins; 9 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/21742.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/21742/head:pull/21742

PR: https://git.openjdk.org/jdk/pull/21742

Reply via email to