On Tue, 11 Apr 2023 19:58:19 GMT, Roman Kennke <rken...@openjdk.org> wrote:
>> The `_base` array is only initialized to nullptr in debug builds. I don't >> see a release barrier in LockStack::push between the update to _base[] and >> the update to _top, nor a corresponding acquire barrier when reading. >> Doesn't this mean it is possible to racily read an uninitialized junk oop >> value from _base[], especially on weak memory models? > > Yes. The whole LockStack is not meant to be accessed cross-thread, pretty > much like any thread's stack is not meant to be accessed like that (including > current stack-locking). So what can go wrong? > With the new locking, we could read junk and compare it to the oop that we're > testing against and get a wrong result. We're not going to crash though. > With the current stack-locking, we would fetch the stack-pointer and check if > that address is within the foreign thread's stack. Again, because the other > thread is not holding still, we might get a wrong result, but we would not > crash. > So I guess we need to answer the question whether or not jmm_GetThreadInfo() > is ok with returning wrong result and what could be the consequences of this. > For example, how important is it that the info about the thread(s) is correct > and consistent (e.g. what happens if we report two threads both holding the > same lock?), etc. But I don't consider this to be part of this PR. > > So my proposal is: leave that code as it is, for now (being racy when > inspecting foreign threads, but don't crash). Open a new issue to investigate > and possibly fix the problem (maybe by safepointing, maybe by handshaking if > that is enough, or maybe we find out we don't need to do anything). Add > comments in relevant places to point out the problem like you and David > suggested earlier. Would that be ok? That seems fine to me, as long as we don't crash. But my understanding is that Generational ZGC will crash if it sees a stale oop. Isn't it possible that the racing read sees junk that looks to Generational ZGC like a stale oop? To avoid this, unused slots may need to be set to nullptr even in product builds. But I'm not a GC expert so maybe there's no problem. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1163306288