On Thu, 19 Sep 2024 14:03:45 GMT, Kevin Rushforth <k...@openjdk.org> wrote:

> This PR fixes a hang on exit which can happen if QuantumToolkit.dispose hangs 
> when called from the QuantumToolkit shutdown hook. A shutdown hook should 
> never run indefinitely, so the fix is to call dispose from a background 
> thread and wait for up to 5 seconds for it to finish normally, and then throw 
> an Error so that the application will be able to exit anyway.
> 
> The QuantumToolkit registers a "Glass/Prism Shutdown Hook" that calls the 
> QuantumToolkit.dispose method to gracefully shutdown the toolkit and 
> renderer. If dispose hangs or deadlocks, this will prevent the application 
> from exiting, requiring the process to be killed. This has been observed on 
> macOS, due to [JDK-8238505](https://bugs.openjdk.org/browse/JDK-8238505), 
> which is is an intermittent deadlock that occurs during shutdown. We ought to 
> eventually fix that intermittent deadlock, but even in the presence of a 
> deadlock or hang, we don't want the shutdown hook to hang.
> 
> This can happen at any time, but is especially a problem during an automated 
> test run, which will then hang until the Jenkins job is either killed or its 
> global timer (currently set to 12 hours) kills the job. Even in that case, 
> processes can be left lying around which can affect subsequent runs.
> 
> I discovered this while testing 
> [JDK-8328629](https://bugs.openjdk.org/browse/JDK-8328629), which adds a 
> default Unit 5 timeout, and without fixing the deadlock, the timeout will be 
> unable to abort the hanging test.
> 
> ### Testing instructions
> 
> I tested this in connection with PR #1575 / 
> [JDK-8328629](https://bugs.openjdk.org/browse/JDK-8328629) using my 
> [test-timeout-and-shutdown-hang](https://github.com/kevinrushforth/jfx/tree/test-timeout-and-shutdown-hang)
>  branch. That branch has the fixes from both this PR and PR #1575 as well as 
> tests that will hang indefinitely without the fixes and fail due to the 
> expected timeout with the fixes.

Marked as reviewed by jvos (Reviewer).

This fix looks good. While we should avoid that the QuantumToolkit.dispose 
hangs, it is indeed good practice to take into account the (rare) possibility 
that it hangs. In this case, there is not much a developer or user can do, so 
the message returned with the InternalError doesn't matter to me.
Regardless of this fix, it would still be good to have a fix for 
[JDK-8238505](https://bugs.openjdk.org/browse/JDK-8238505) (although that is 
likely less urgent now that the CI blocking is tackled) -- and there might be 
other issues that cause the dispose thread to not finish.

-------------

PR Review: https://git.openjdk.org/jfx/pull/1574#pullrequestreview-2318242090
PR Comment: https://git.openjdk.org/jfx/pull/1574#issuecomment-2363704561

Reply via email to