Re: RFR: 8291237: Encapsulate nmethod Deoptimization logic [v2]

Axel Boldt-Christmas Wed, 03 Aug 2022 06:40:27 -0700

On Mon, 1 Aug 2022 04:58:49 GMT, Axel Boldt-Christmas <[email protected]> wrote:


>> The proposal is to encapsulate the nmethod mark for deoptimization logic in 
>> one place and only allow access to the `mark_for_deoptimization` from a 
>> closure object:
>> ```C++
>> class DeoptimizationMarkerClosure : StackObj {
>> public:
>>   virtual void marker_do(Deoptimization::MarkFn mark_fn) = 0;
>> };
>> 
>> This closure takes a `MarkFn` which it uses to mark which nmethods should be 
>> deoptimized. This marking can only be done through the `MarkFn` and a 
>> `MarkFn` can only be created in the following code which runs the closure. 
>> ```C++
>> {
>>   NoSafepointVerifier nsv;
>>   assert_locked_or_safepoint(Compile_lock);
>>   marker_closure.marker_do(MarkFn());
>>   anything_deoptimized = deoptimize_all_marked();
>> }
>> if (anything_deoptimized) {
>>   run_deoptimize_closure();
>> }
>> 
>> This ensures that this logic is encapsulated and the `NoSafepointVerifier` 
>> and `assert_locked_or_safepoint(Compile_lock)` makes `deoptimize_all_marked` 
>> not having to scan the whole code cache sound.
>> 
>> The exception to this pattern, from `InstanceKlass::unload_class`, is 
>> discussed in the JBS issue, and gives reasons why not marking for 
>> deoptimization there is ok.
>> 
>> An effect of this encapsulation is that the deoptimization logic was moved 
>> from the `CodeCache` class to the `Deoptimization` class and the class 
>> redefinition logic was moved from the `CodeCache` class to the 
>> `VM_RedefineClasses` class/operation. 
>> 
>> Testing: Tier 1-5
>> 
>> _Update_
>> ---
>> Switched too using a RAII object to track the context instead of putting 
>> code in a closure. But all the encapsulation is still the same. 
>> 
>> Testing: Tier 1-7
>> 
>> _Update_
>> ---
>>> @stefank suggested splitting out unloading klass logic change into a 
>>> separate issue [JDK-8291718](https://bugs.openjdk.org/browse/JDK-8291718).
>>> 
>>> Will probably also limit this PR to only encapsulation. (Skipping the 
>>> linked list optimisation) And create a separate issue for that as well.
>>> 
>>> But this creates a chain of three dependent issues. 
>>> [JDK-8291237](https://bugs.openjdk.org/browse/JDK-8291237) depends on 
>>> [JDK-8291718](https://bugs.openjdk.org/browse/JDK-8291718). And the link 
>>> list optimisation depend will depend on 
>>> [JDK-8291237](https://bugs.openjdk.org/browse/JDK-8291237).
>>> 
>>> Will mark this as a draft for now and create a PR for 
>>> [JDK-8291718](https://bugs.openjdk.org/browse/JDK-8291718) first.
>
> Axel Boldt-Christmas has updated the pull request incrementally with three 
> additional commits since the last revision:
> 
>  - Add assertions
>  - Fix marked logic
>  - Erik refactorings

As my previous comment will probably back out the linked list of this PR and 
create a separate issue that depends on this encapsulation that only deal with 
swapping to a linked list instead of walking the whole code cache.

But as it will be based on this code your question is still relevant. 

I am pretty new to the codebase so please correct me if I am wrong, especially 
with regards to different locks, their scope and usage. Here is my 
understanding of why the linked list is valid as is. 

While working with the list we have a few things which are true:
* `assert_locked_or_safepoint(Compile_lock);` anywhere were we modify the list, 
*  once something is linked it cannot be linked again, as it is only linked if 
it is `not_marked` and the status can only go  `not_marked -> deoptimize | 
deoptimize_noupdate -> deoptimize_done`. 
    *  `assert(extract_compiled_method(_mark_link) == nullptr, ...)` for some 
extra sanity around this
    *  While the `MarkForDeoptimizationStatus` value makes linking something 
already linked impossible the assert above is not enough to catch if a tail of 
the list is linked in creating a cycle. Can either add an assert that walks the 
list and checks that the `tail != this`, or the `next_marked()` code can be 
changed to `take_next_marked()` which sets the field to `nullptr` and thus 
breaks any cycles when iterating (avoids infinite loops). The second would be 
alright as creating a loop from the tail to the root will never drop parts of 
the list, even if more elements are added after the cycle is created, as the  
`assert(extract_compiled_method(_mark_link) == nullptr, ...)`  assert that the 
list will not break.
* From the creation of a `DeoptimizationContext` to until the linked list has 
been processed (`make_not_entrant` and `make_deoptimized`) we have 
`NoSafepointVerifier` (which is the time to safe point issue this change is 
trying to address)
* And this active part of the DeoptimizationContext cannot overlap with another 
active part of another DeoptimizationContext, checked via a bool flag. Maybe it 
is not correct to have a non volatile static bool here, but I though 
`assert_locked_or_safepoint(Compile_lock);` would be enough to guarantee 
synchronisation. But should probably change to load acquire atomics.

I think most of the terminology will change to make it more deoptimization 
specific. Having it called `not_marked`, `_root_mark_link`,  `_mark_link`, 
`next_mark()` and `take_root()` is to general and can lead to confusion. 
Another reason to split this out to a separate issue. Also using some other 
terminology than `mark` could be used.

-------------

PR: https://git.openjdk.org/jdk/pull/9655

Re: RFR: 8291237: Encapsulate nmethod Deoptimization logic [v2]

Reply via email to