On Wed, 25 Jun 2025 13:02:03 GMT, Kevin Walls <kev...@openjdk.org> wrote:

>> ThreadDumper/ThreadSnapshot need to handle a failure to resolve the native
>> VM JavaThread from a java.lang.Thread. This is hard to reproduce but a
>> thread that has since terminated can provoke a crash. Recognise this and
>> return a null ThreadSnapshot.
>
> Kevin Walls has updated the pull request incrementally with two additional
> commits since the last revision:
>
>  - comment update
>  - comment update

I was reproducing this frequently, monitoring with asserts in a fastdebug
build, and the problems started with
ThreadSnapshotFactory::get_thread_snapshot() getting a null from
JNIHandles::resolve(jthread).
...there are several different crashes in the product build.

> But _thread_h() has already been used a number of times before we get here
> and if it were null we should have crashed long ago. ???

There can be some uses that don't cause a problem, like:
java_lang_VirtualThread::is_instance(_thread_h());  (includes null check)
...and others are not called.

Hmm, maybe there are some that look like they should have crashed, e.g.
1290   _thread_name = OopHandle(oop_storage(), java_lang_Thread::name(_thread_h()));
<-- name does: return java_thread->obj_field(_name_offset);

...I don't see why this didn't fault in the report from the JBS issue I was
interpreting here (not my debug build). Either it was reordered, or something
else happened, or I just haven't understood enough.

It is much easier to read an assert in get_thread_snapshot than to let it
continue and crash in vframestream etc... But the null from
JNIHandles::resolve(jthread) is the earliest problem I found.

I'm redoing this with the cv_internal_thread_to_JavaThread usage...

A little concerned that ThreadsListHandle::cv_internal_thread_to_JavaThread
takes jobject jthread, our ref to a java.lang.Thread, and also calls
811   oop thread_oop = JNIHandles::resolve_non_null(jthread);
...which asserts if it contains null, but maybe I don't know all the
ThreadsListHandle magic.
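
To illustrate the failure mode discussed above (a handle that resolves to null once the thread has gone, and a snapshot routine that returns null instead of dereferencing it), here is a minimal standalone sketch. It is not HotSpot code: FakeHandleTable, FakeThreadObj, FakeSnapshot, get_snapshot and demo are all hypothetical names made up for the illustration, and the real resolve and snapshot logic is considerably more involved.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-ins for illustration only; these are NOT HotSpot types.
struct FakeThreadObj { std::string name; };
using FakeHandle = std::uint64_t;

class FakeHandleTable {
  std::unordered_map<FakeHandle, FakeThreadObj> live_;
public:
  void add(FakeHandle h, std::string name) { live_[h] = FakeThreadObj{std::move(name)}; }
  void terminate(FakeHandle h) { live_.erase(h); }

  // Resolve that may return null, e.g. after the thread has terminated.
  const FakeThreadObj* resolve(FakeHandle h) const {
    auto it = live_.find(h);
    return it == live_.end() ? nullptr : &it->second;
  }
};

struct FakeSnapshot { std::string thread_name; };

// Check the resolved pointer once, up front, and report failure by
// returning null rather than crashing later on a null dereference.
std::unique_ptr<FakeSnapshot> get_snapshot(const FakeHandleTable& table, FakeHandle h) {
  const FakeThreadObj* obj = table.resolve(h);
  if (obj == nullptr) {
    return nullptr;  // caller must be prepared for a null snapshot
  }
  auto snap = std::make_unique<FakeSnapshot>();
  snap->thread_name = obj->name;
  return snap;
}

// Usage: a live thread yields a snapshot, a terminated one yields null.
void demo() {
  FakeHandleTable table;
  table.add(1, "worker");
  assert(get_snapshot(table, 1) != nullptr);
  table.terminate(1);
  assert(get_snapshot(table, 1) == nullptr);
}
```

The point of the sketch is only the ordering: the null check sits at the earliest place the null can appear, so nothing downstream has to guess whether the object is valid.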

I had a day yesterday where the problem would not reproduce at all, which
made it hard to verify! Will update...

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25958#issuecomment-3012360012