On Wed, 15 Nov 2023 05:35:49 GMT, David Holmes <dhol...@openjdk.org> wrote:
>> Please, review this fix for a corner case handling of `jmethodID` values. >> >> The issue is related to the interplay between `jmethodID` values and method >> redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` >> instance. Once that method gets redefined, the `jmethodID` is updated to >> point to the last `Method` version. >> Unless the method is still on stack/running, in which case the original >> `jmethodID` will be redirected to the latest `Method` version and at the >> same time the 'previous' `Method` version will receive a new `jmethodID` >> pointing to that previous version. >> >> If we happen to capture stacktrace via `GetStackTrace` or >> `GetAllStackTraces` JVMTI calls while this previous `Method` version is >> still on stack we will have the corresponding frame identified by a >> `jmethodID` pointing to that version. >> However, sooner or later the 'previous' class version becomes eligible for >> cleanup at what time all contained `Method` instances. The cleanup process >> will not perform the `jmethodID` pointer maintenance and we will end up with >> pointers to deallocated memory. >> This is caused by the fact that the `jmethodID` lifecycle is bound to >> `ClassLoaderData` instance and all relevant `jmethodID`s will get >> batch-updated when the class loader is being released and all its classes >> are getting unloaded. >> >> This means that we need to make sure that if a `Method` instance is being >> deallocate the associated `jmethodID` (if any) must not point to the >> deallocated instance once we are finished. Unfortunately, we can not just >> update the `jmethodID` values in bulk when purging an old class version - >> the per `InstanceKlass` jmethodID cache is present only for the main class >> version and contains `jmethodID` values for both the old and current method >> versions. >> >> Therefore we need to perform `jmethodID` lookup when we are about to >> deallocate a `Method` instance and clean up the pointer only if that >> `jmethodID` is pointing to the `Method` instance which is being deallocated. >> >> _(For anyone interested, a much lengthier writeup is available in [my >> blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ > > src/hotspot/share/oops/instanceKlass.cpp line 531: > >> 529: >> 530: void InstanceKlass::deallocate_methods(ClassLoaderData* loader_data, >> 531: Array<Method*>* methods, >> InstanceKlass* klass) { > > An explicit boolean parameter would be cleaner/clearer. I just removed the `klass` argument. It is not really used anyway. > src/hotspot/share/oops/instanceKlass.cpp line 542: > >> 540: if (klass) { >> 541: jmethodID jmid = method->find_jmethod_id_or_null(); >> 542: // Do the pointer maintenance before releasing the metadata, >> just in case > > I assume there should be a period after 'case`. But just in case of what? The code was moved to`method.cpp` and this particular comment line became obsolete > src/hotspot/share/oops/instanceKlass.cpp line 549: > >> 547: if (jmid != nullptr && *((Method**)jmid) == method) { >> 548: *((Method**)jmid) = nullptr; >> 549: } > > This should be abstracted behind a utility function e.g. > `method->clear_jmethod_id()`. Done > src/hotspot/share/oops/method.cpp line 2277: > >> 2275: } >> 2276: } >> 2277: > > Can this race with redefinition? The cleanup of previous versions is executed in VM_Operation at a safepoint - therefore we should be safe against races with class redefinitions. I am adding an assert to `clear_jmethod_id()` to check for being at a safepoint. > src/hotspot/share/oops/method.hpp line 730: > >> 728: // so handles are not used to avoid deadlock. >> 729: jmethodID find_jmethod_id_or_null() { >> 730: return method_holder() != nullptr ? >> method_holder()->jmethod_id_or_null(this) : nullptr; > > If `method_holder()` is null at this point what does that mean for the > lifecycle of the Method? Please, ignore this part of code for the time being. I had a crash in CI which was pointing vaguely to this code - unfortunately, the hs_err.log files are not preserved in the test archives and I am not able to reproduce the failure locally. I need to debug the crash and make sure I understand the root cause. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1394642247 PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1394647034 PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1394647173 PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1395100685 PR Review Comment: https://git.openjdk.org/jdk/pull/16662#discussion_r1395102362