On Wed, 10 Jul 2024 17:10:46 GMT, Chris Plummer <cjplum...@openjdk.org> wrote:

>>> This does not look correct to me.
>>> This is the last test scenario - thread2.resume should resumes the thread 
>>> while vm is suspended.
>>> thread2 should not be blocked on main thread.
>>> Looking at the debuggee I suppose the blocking is possible during logging. 
>>> I'd suggest to update the debugee and remove any logging between 
>>> breakpoints 2 and 3
>> 
>> It looks like the debuggee gets as far as the following:
>> 
>> 
>>     public void runt2() {
>>         log("method 'runt2' enter 1");
>>         i1++;
>>         i2--;
>>         log("method 'run2t' exit");
>>         return;
>>     }
>> 
>> 
>> It prints the first log and hits a breakpoint setup on the 2nd line. The 
>> debugger resumes thread2 after this, but we never see the 2nd log. Whenever 
>> we see this failure, the following logs from the mainThread are always 
>> delayed (by a lot):
>> 
>> 
>> debugee.stderr> **> mainThread: mainThread is out of: synchronized 
>> (lockingObject) {
>> debugee.stderr> **> mainThread: waiting for an instruction from the debugger 
>> ...
>> 
>> 
>> I think this delay is resulting in the the mainThread being in the middle of 
>> doing one of these logs when the vm.suspend() is done. This leaves 
>> mainThread suspended while holding a lock needed for doing logging (logging 
>> is just a simple System.err.prinln()). I'm trying to prove this by getting a 
>> debuggee thread dump when getting the 3rd Breakpoint event times out, but 
>> for some reason once I added this code I could no longer reproduce the 
>> problem (still trying though).
>> 
>> I don't like the idea of avoiding this issue by getting rid of certain 
>> problematic logging. It seems error prone. Someone could add some new 
>> logging in the future. I'll see if there is a better solution than the 
>> vm.resume(). Perhaps I could just resume mainThread. However, I think with 
>> virtual threads I/O can be dependent on other threads like an "unparker" 
>> thread.
>> 
>> Another solution might be to have the debugger and debuggee do an additional 
>> handshake so we can guarantee that mainThread is done with these two log 
>> statements. Currently when we get to the 2nd log statement, that just means 
>> the debuggee is waiting for a "quit" command from the debugger. We could at 
>> this point have the debuggee send a command to the debugger, and have the 
>> debugger wait for this command before it does the vm.suspend().
>
> I was finally able to reproduce the issue with the stack dumping support in 
> place. mainThread is in the middle of printing the 1st of the two logs 
> mention above. mainThead is suspended and is holding a println related lock, 
> and thread2 is stuck on the 2nd log call in runt2 awaiting the same lock.

I was able to add synchronization between the debugger and debuggee to fix this 
issue, but I don't really like the solution. It just adds more complexity and 
makes the test even harder to follow. What I'd like to do is just put a short 
sleep in before the vm.suspend(). Let me know if you think this is ok and I'll 
update the PR with the changes.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20088#discussion_r1672976958

Reply via email to