On Mon, 12 May 2025 23:19:58 GMT, Serguei Spitsyn <sspit...@openjdk.org> wrote:
> The tests `SuspendResume1`, `SuspendResume2` and `SuspendResumeAll`
> intermittently fail with a timeout (deadlock). The tests run with
> `-Djdk.virtualThreadScheduler.maxPoolSize=1`, so there is only one carrier.
> The short sleep in `TestedThread.run` isn't sufficient to make progress. This
> happens if tasks pushed by the delayed scheduler execute before the
> tasks for the newly started virtual thread. FJP won't search other submission
> queues until the queue it keeps going back to is empty or there is
> contention. These deadlocks can be reproduced more reliably if the sleep in
> `TestedThread.run` is made minimal (1 millisecond).
> The fix is to increase the sleep to 50 milliseconds and also to decrease the
> busy part of the busy loop.
>
> Testing:
>  - Mach5 test runs of the fixed tests

Marked as reviewed by alanb (Reviewer).

I see Fei Yang's comment confirming that this fixes the timeouts in their environment; that is useful to know. The main lesson here is that virtual thread scheduling is not fair. A virtual thread doing a short sleep, sleep(1) in one case here, may be continued and execute before other virtual threads that are queued to continue.

-------------

PR Review: https://git.openjdk.org/jdk/pull/25194#pullrequestreview-2838801041
PR Comment: https://git.openjdk.org/jdk/pull/25194#issuecomment-2878748617