On Mon, 20 Jan 2025 07:56:49 GMT, Axel Boldt-Christmas <abold...@openjdk.org> 
wrote:

> The proposed change here is the following:
>  1. Move the `vmdependencies` list head from the `Context` to the `CallSite` 
> object
>  2. Remove the Context object and its corresponding cleaner
> 
> (1.) fixes the crash. (2.) is because without `vmdependencies` the Context 
> and its cleaner serves no purpose.
> 
> First what goes wrong without this patch.
> 
> Currently when the GC unlinks a nmethod it clean/flush out all the dependency 
> contexts. These are either attached to an InstanceKlass, or a Context object 
> reachable through a CallSite oop. A nmethod is unlinked when either one of 
> its oops have died or it is heuristically determined to be cold and should be 
> unloaded. So when the GC clean's the context through a CallSite oop, it may 
> be that that CallSite object is dead. 
> 
> For ZGC, which is a concurrent generational GC (the different generations are 
> collected concurrently, but in a coordinated manner), it is important that 
> the unlinking is coordinated with the reclamation of this dead object. In 
> generational ZGC all nmethod oops are considered as strong roots if they 
> reside in the young generation and thusly can only become unreachable / dead 
> after promotion to the old generation. This means that the CallSite object at 
> the time of unlinking is either reachable / live, or unreachable / dead and 
> is reclaimed by the old generation collection (the same generation that does 
> the unlinking). So we can make reading from this object safe by not 
> reclaiming the object before unlinking is finished. 
> 
> The issue is that we do not have this guarantee for the Context object. As 
> this is a distinct object it may be that it has not been promoted and resides 
> in the young generation at the time of its CallSite object becoming 
> unreachable and collected by the old generation collection.
> 
> If this is the case and a young generation collection runs after old marking 
> has finished, we have two bad scenarios. If it the young generation 
> collection starts after reference processing and the cleaner has run, the 
> Context object would be unreachable and the young generation collection  
> would reclaim the memory. If it started before the reference processing it 
> would still be reachable, but may be relocated. 
> 
> For reachable old CallSite objects the Context oop field would have been 
> tracked and remapped if a young collection relocates the Context object, 
> however this is not true then the CallSite is unreachable. The Context object 
> may have moved or been reclaimed, and the load barrier on the field will 
> produce ...

Good work, Axel. `Cleaner`-based solution to manage nmethod dependencies on 
`java.lang.invoke.CallSite`s does look redundant. If that's the case, then I'd 
expect that by the time cleanup action is invoked corresponding dependency list 
is empty. Is it the case in practice? 

But the original problem makes me wonder whether `Cleaner`-based approach to 
manage native resources is broken or not.
Dependent instance is used to avoid referent to be artificially kept alive by 
cleanup action. If it is not safe to access it in order to perform cleanup, how 
the whole approach is intended to work? Or does anything make 'CallSite` use 
case special?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23194#issuecomment-2617796038

Reply via email to