On Tue, 11 Apr 2023 20:40:14 GMT, Dean Long <dl...@openjdk.org> wrote:

>> Yes. The whole LockStack is not meant to be accessed cross-thread, pretty 
>> much like any thread's stack is not meant to be accessed like that 
>> (including current stack-locking). So what can go wrong?
>> With the new locking, we could read junk and compare it to the oop that 
>> we're testing against and get a wrong result. We're not going to crash 
>> though.
>> With the current stack-locking, we would fetch the stack-pointer and check 
>> if that address is within the foreign thread's stack. Again, because the 
>> other thread is not holding still, we might get a wrong result, but we would 
>> not crash.
>> So I guess we need to answer the question whether or not jmm_GetThreadInfo() 
>> is ok with returning wrong result and what could be the consequences of 
>> this. For example, how important is it that the info about the thread(s) is 
>> correct and consistent (e.g. what happens if we report two threads both 
>> holding the same lock?), etc. But I don't consider this to be part of this 
>> PR.
>> 
>> So my proposal is: leave that code as it is, for now (being racy when 
>> inspecting foreign threads, but don't crash). Open a new issue to 
>> investigate and possibly fix the problem (maybe by safepointing, maybe by 
>> handshaking if that is enough, or maybe we find out we don't need to do 
>> anything). Add comments in relevant places to point out the problem like you 
>> and David suggested earlier. Would that be ok?
>
> That seems fine to me, as long as we don't crash.  But my understanding is 
> that Generational ZGC will crash if it sees a stale oop.  Isn't it possible 
> that the racing read sees junk that looks to Generational ZGC like a stale 
> oop?  To avoid this, unused slots may need to be set to nullptr even in 
> product builds.  But I'm not a GC expert so maybe there's no problem.

The old code is "racy but safe - it basically answers the question "what thread 
held the lock at the time I was asking?" and if we get a stack-addr as the 
owner at the time we ask, and that stack-address belongs to a given thread t 
then we report t as the owner. The fact t may have released the lock as soon as 
we read the stack-addr is immaterial.

The new code may be a different matter however. Now the race involves oops, and 
potentially stale ones IIUC what Stefan is saying. So now the race is not safe, 
and potentially may crash.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/10907#discussion_r1163491093

Reply via email to