On Mon, 1 Aug 2022 04:58:49 GMT, Axel Boldt-Christmas <d...@openjdk.org> wrote:
>> The proposal is to encapsulate the nmethod mark for deoptimization logic in
>> one place and only allow access to the `mark_for_deoptimization` from a
>> closure object:
>> ```C++
>> class DeoptimizationMarkerClosure : StackObj {
>> public:
>>   virtual void marker_do(Deoptimization::MarkFn mark_fn) = 0;
>> };
>> ```
>>
>> This closure takes a `MarkFn` which it uses to mark which nmethods should be
>> deoptimized. This marking can only be done through the `MarkFn` and a
>> `MarkFn` can only be created in the following code which runs the closure.
>> ```C++
>> {
>>   NoSafepointVerifier nsv;
>>   assert_locked_or_safepoint(Compile_lock);
>>   marker_closure.marker_do(MarkFn());
>>   anything_deoptimized = deoptimize_all_marked();
>> }
>> if (anything_deoptimized) {
>>   run_deoptimize_closure();
>> }
>> ```
>>
>> This ensures that this logic is encapsulated and the `NoSafepointVerifier`
>> and `assert_locked_or_safepoint(Compile_lock)` makes `deoptimize_all_marked`
>> not having to scan the whole code cache sound.
>>
>> The exception to this pattern, from `InstanceKlass::unload_class`, is
>> discussed in the JBS issue, and gives reasons why not marking for
>> deoptimization there is ok.
>>
>> An effect of this encapsulation is that the deoptimization logic was moved
>> from the `CodeCache` class to the `Deoptimization` class and the class
>> redefinition logic was moved from the `CodeCache` class to the
>> `VM_RedefineClasses` class/operation.
>>
>> Testing: Tier 1-5
>>
>> _Update_
>> ---
>> Switched to using a RAII object to track the context instead of putting
>> code in a closure. But all the encapsulation is still the same.
>>
>> Testing: Tier 1-7
>>
>> _Update_
>> ---
>>> @stefank suggested splitting out unloading klass logic change into a
>>> separate issue [JDK-8291718](https://bugs.openjdk.org/browse/JDK-8291718).
>>>
>>> Will probably also limit this PR to only encapsulation. (Skipping the
>>> linked list optimisation) And create a separate issue for that as well.
>>>
>>> But this creates a chain of three dependent issues.
>>> [JDK-8291237](https://bugs.openjdk.org/browse/JDK-8291237) depends on
>>> [JDK-8291718](https://bugs.openjdk.org/browse/JDK-8291718). And the linked
>>> list optimisation will depend on
>>> [JDK-8291237](https://bugs.openjdk.org/browse/JDK-8291237).
>>>
>>> Will mark this as a draft for now and create a PR for
>>> [JDK-8291718](https://bugs.openjdk.org/browse/JDK-8291718) first.
>
> Axel Boldt-Christmas has updated the pull request incrementally with three
> additional commits since the last revision:
>
>  - Add assertions
>  - Fix marked logic
>  - Erik refactorings

As per my previous comment, I will probably back the linked list out of this PR and create a separate issue, depending on this encapsulation, that only deals with swapping to a linked list instead of walking the whole code cache. But as it will be based on this code, your question is still relevant.

I am pretty new to the codebase, so please correct me if I am wrong, especially with regard to the different locks, their scope and usage.

Here is my understanding of why the linked list is valid as is. While working with the list, a few things are true:

* `assert_locked_or_safepoint(Compile_lock);` holds anywhere we modify the list.
* Once something is linked it cannot be linked again, as it is only linked if it is `not_marked` and the status can only go `not_marked -> deoptimize | deoptimize_noupdate -> deoptimize_done`.
* `assert(extract_compiled_method(_mark_link) == nullptr, ...)` is there for some extra sanity around this.
* While the `MarkForDeoptimizationStatus` value makes linking something already linked impossible, the assert above is not enough to catch the case where the tail of the list is linked in, creating a cycle. I can either add an assert that walks the list and checks that `tail != this`, or change the `next_marked()` code to `take_next_marked()`, which sets the field to `nullptr` and thus breaks any cycle while iterating (avoiding infinite loops). The second would be alright, as creating a loop from the tail to the root will never drop parts of the list, even if more elements are added after the cycle is created, since the `assert(extract_compiled_method(_mark_link) == nullptr, ...)` asserts that the list will not break.
* From the creation of a `DeoptimizationContext` until the linked list has been processed (`make_not_entrant` and `make_deoptimized`) we have a `NoSafepointVerifier` (which is the time to safepoint issue this change is trying to address).
* This active part of a `DeoptimizationContext` cannot overlap with the active part of another `DeoptimizationContext`; this is checked via a bool flag. Maybe it is not correct to have a non-volatile static bool here, but I thought `assert_locked_or_safepoint(Compile_lock);` would be enough to guarantee synchronisation. I should probably change it to use load-acquire atomics.

I think most of the terminology will change to make it more deoptimization specific. Having it called `not_marked`, `_root_mark_link`, `_mark_link`, `next_mark()` and `take_root()` is too general and can lead to confusion. That is another reason to split this out into a separate issue. Some terminology other than `mark` could also be used.

-------------

PR: https://git.openjdk.org/jdk/pull/9655
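
To illustrate the `take_next_marked()` idea from the list above, here is a minimal standalone sketch, not the code in the PR. `Node`, `MarkList`, `mark()` and `process_all()` are hypothetical stand-ins (a plain struct instead of an nmethod, a `bool` instead of `MarkForDeoptimizationStatus`), and all locking is left out. It only shows that clearing the link while iterating makes the walk terminate even if the tail has been linked back to the root.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an nmethod carrying an intrusive mark link.
struct Node {
  int id;
  bool marked = false;        // stand-in for "status != not_marked"
  Node* mark_link = nullptr;  // stand-in for _mark_link

  explicit Node(int i) : id(i) {}

  // Return the successor and clear the link, so each link is followed at most once.
  Node* take_next_marked() {
    Node* next = mark_link;
    mark_link = nullptr;
    return next;
  }
};

// Hypothetical stand-in for the list rooted at _root_mark_link.
struct MarkList {
  Node* root = nullptr;

  // Link a node at the head. A node that is already marked is never linked again.
  bool mark(Node* n) {
    if (n->marked) {
      return false;
    }
    assert(n->mark_link == nullptr && "an unmarked node must not be linked");
    n->marked = true;
    n->mark_link = root;
    root = n;
    return true;
  }

  // Detach the whole list and hand back its first node.
  Node* take_root() {
    Node* r = root;
    root = nullptr;
    return r;
  }
};

// Walk the list destructively and return how many nodes were visited.
// With a tail-to-root cycle the root is visited a second time, after which
// its already-cleared link ends the walk.
std::size_t process_all(MarkList& list) {
  std::size_t visited = 0;
  for (Node* n = list.take_root(); n != nullptr; n = n->take_next_marked()) {
    ++visited;
  }
  return visited;
}
```

In this sketch a cycle costs one extra visit of the root rather than an infinite loop. Real processing would presumably have to tolerate that second visit, for example by checking the status before acting on the node.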