On Tue, 11 Apr 2023 20:40:14 GMT, Dean Long <dl...@openjdk.org> wrote:
>> Yes. The whole LockStack is not meant to be accessed cross-thread, pretty >> much like any thread's stack is not meant to be accessed like that >> (including current stack-locking). So what can go wrong? >> With the new locking, we could read junk and compare it to the oop that >> we're testing against and get a wrong result. We're not going to crash >> though. >> With the current stack-locking, we would fetch the stack-pointer and check >> if that address is within the foreign thread's stack. Again, because the >> other thread is not holding still, we might get a wrong result, but we would >> not crash. >> So I guess we need to answer the question whether or not jmm_GetThreadInfo() >> is ok with returning wrong result and what could be the consequences of >> this. For example, how important is it that the info about the thread(s) is >> correct and consistent (e.g. what happens if we report two threads both >> holding the same lock?), etc. But I don't consider this to be part of this >> PR. >> >> So my proposal is: leave that code as it is, for now (being racy when >> inspecting foreign threads, but don't crash). Open a new issue to >> investigate and possibly fix the problem (maybe by safepointing, maybe by >> handshaking if that is enough, or maybe we find out we don't need to do >> anything). Add comments in relevant places to point out the problem like you >> and David suggested earlier. Would that be ok? > > That seems fine to me, as long as we don't crash. But my understanding is > that Generational ZGC will crash if it sees a stale oop. Isn't it possible > that the racing read sees junk that looks to Generational ZGC like a stale > oop? To avoid this, unused slots may need to be set to nullptr even in > product builds. But I'm not a GC expert so maybe there's no problem. The old code is "racy but safe - it basically answers the question "what thread held the lock at the time I was asking?" and if we get a stack-addr as the owner at the time we ask, and that stack-address belongs to a given thread t then we report t as the owner. The fact t may have released the lock as soon as we read the stack-addr is immaterial. The new code may be a different matter however. Now the race involves oops, and potentially stale ones IIUC what Stefan is saying. So now the race is not safe, and potentially may crash. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1163491093