On Mon, 20 Jan 2025 07:56:49 GMT, Axel Boldt-Christmas <abold...@openjdk.org> wrote:
> The proposed change here is the following: > 1. Move the `vmdependencies` list head from the `Context` to the `CallSite` > object > 2. Remove the Context object and its corresponding cleaner > > (1.) fixes the crash. (2.) is because without `vmdependencies` the Context > and its cleaner serves no purpose. > > First what goes wrong without this patch. > > Currently when the GC unlinks a nmethod it clean/flush out all the dependency > contexts. These are either attached to an InstanceKlass, or a Context object > reachable through a CallSite oop. A nmethod is unlinked when either one of > its oops have died or it is heuristically determined to be cold and should be > unloaded. So when the GC clean's the context through a CallSite oop, it may > be that that CallSite object is dead. > > For ZGC, which is a concurrent generational GC (the different generations are > collected concurrently, but in a coordinated manner), it is important that > the unlinking is coordinated with the reclamation of this dead object. In > generational ZGC all nmethod oops are considered as strong roots if they > reside in the young generation and thusly can only become unreachable / dead > after promotion to the old generation. This means that the CallSite object at > the time of unlinking is either reachable / live, or unreachable / dead and > is reclaimed by the old generation collection (the same generation that does > the unlinking). So we can make reading from this object safe by not > reclaiming the object before unlinking is finished. > > The issue is that we do not have this guarantee for the Context object. As > this is a distinct object it may be that it has not been promoted and resides > in the young generation at the time of its CallSite object becoming > unreachable and collected by the old generation collection. > > If this is the case and a young generation collection runs after old marking > has finished, we have two bad scenarios. If it the young generation > collection starts after reference processing and the cleaner has run, the > Context object would be unreachable and the young generation collection > would reclaim the memory. If it started before the reference processing it > would still be reachable, but may be relocated. > > For reachable old CallSite objects the Context oop field would have been > tracked and remapped if a young collection relocates the Context object, > however this is not true then the CallSite is unreachable. The Context object > may have moved or been reclaimed, and the load barrier on the field will > produce ... Ran tier1 through tier8 with G1 and the guarantee that the dependency list is empty, and it is indeed the case. Thanks for all the reviews. ------------- PR Comment: https://git.openjdk.org/jdk/pull/23194#issuecomment-2623840786