[ https://issues.apache.org/jira/browse/FLINK-5759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15859733#comment-15859733 ]
ASF GitHub Bot commented on FLINK-5759:
---------------------------------------

GitHub user StephanEwen opened a pull request:

    https://github.com/apache/flink/pull/3290

    [FLINK-5759] [jobmanager] Set UncaughtExceptionHandlers for JobManager's Future and I/O thread pools

    Currently, the thread pools of the `JobManager` do not have any
    `UncaughtExceptionHandler`.

    While uncaught exceptions are rare (Flink handles exceptions aggressively
    in most places), when exceptions slip through in these threads (which
    execute future responses and delayed actions), the `JobManager` may be in
    an inconsistent state and not function properly any more.

    This pull request adds a handler that results in a process kill in the
    case of uncaught exceptions. Letting the JobManager be restarted by the
    respective cluster framework is the only guaranteed way to be safe.

    This also unifies the `ExecutorThreadFactory` and `NamedThreadFactory`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StephanEwen/incubator-flink uncaught_handlers

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3290.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3290

----
commit 3602631353dbdf230044db7fba1890600e648101
Author: Stephan Ewen <se...@apache.org>
Date:   2017-02-09T13:04:17Z

    [FLINK-5759] [jobmanager] Set UncaughtExceptionHandlers for JobManager's Future and I/O thread pools

----

> Set an UncaughtExceptionHandler for all Thread Pools in JobManager
> ------------------------------------------------------------------
>
>                 Key: FLINK-5759
>                 URL: https://issues.apache.org/jira/browse/FLINK-5759
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>    Affects Versions: 1.2.0
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>             Fix For: 1.3.0
>
>
> Currently, the thread pools of the {{JobManager}} do not have any
> {{UncaughtExceptionHandler}}.
> While uncaught exceptions are rare (Flink handles exceptions aggressively in
> most places), when exceptions slip through in these threads (which execute
> future responses and delayed actions), the JobManager may be in an
> inconsistent state and not function properly any more.
> We should add a handler that results in a process kill in the case of
> uncaught exceptions. Letting the JobManager be restarted by the respective
> cluster framework is the only guaranteed way to be safe.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
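
For readers unfamiliar with the mechanism described above, here is a minimal sketch of a thread factory that installs an `UncaughtExceptionHandler` on every pool thread. It is not the code from the pull request. The names `UncaughtHandlerSketch`, `FatalExitHandler`, `HandlerThreadFactory`, `runDemo`, and the exit code value are assumptions made for illustration. So that the sketch can run without killing its own JVM, the demo passes a recording exit action; a real deployment would presumably pass something like `System::exit` or `Runtime.getRuntime()::halt` instead.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntConsumer;

public class UncaughtHandlerSketch {

    // Arbitrary non-zero exit code chosen for this sketch (an assumption).
    static final int EXIT_CODE = 1;

    /** Hypothetical handler: reports the error, then always runs the exit action. */
    static final class FatalExitHandler implements Thread.UncaughtExceptionHandler {
        private final IntConsumer exitAction;

        FatalExitHandler(IntConsumer exitAction) {
            this.exitAction = exitAction;
        }

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            try {
                System.err.println("FATAL: uncaught exception in thread '"
                        + t.getName() + "': " + e);
            } finally {
                // Run the exit action even if reporting itself fails.
                exitAction.accept(EXIT_CODE);
            }
        }
    }

    /** Hypothetical factory: names each thread and attaches the handler. */
    static final class HandlerThreadFactory implements ThreadFactory {
        private final AtomicInteger counter = new AtomicInteger(1);
        private final String prefix;
        private final Thread.UncaughtExceptionHandler handler;

        HandlerThreadFactory(String prefix, Thread.UncaughtExceptionHandler handler) {
            this.prefix = prefix;
            this.handler = handler;
        }

        @Override
        public Thread newThread(Runnable r) {
            Thread t = new Thread(r, prefix + "-thread-" + counter.getAndIncrement());
            t.setDaemon(true);
            t.setUncaughtExceptionHandler(handler);
            return t;
        }
    }

    /** Runs a failing task and returns the code passed to the exit action (0 if none). */
    static int runDemo() {
        AtomicInteger recorded = new AtomicInteger(0);
        CountDownLatch handled = new CountDownLatch(1);
        // Stand-in for System::exit so the demo does not terminate the JVM.
        IntConsumer recordingExit = code -> {
            recorded.set(code);
            handled.countDown();
        };
        ExecutorService pool = Executors.newFixedThreadPool(
                2, new HandlerThreadFactory("demo-future", new FatalExitHandler(recordingExit)));
        try {
            // execute() lets the exception escape the worker thread.
            pool.execute(() -> {
                throw new IllegalStateException("simulated bug");
            });
            handled.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return recorded.get();
    }

    public static void main(String[] args) {
        System.out.println("exit action received code: " + runDemo());
    }
}
```

One caveat with this approach: the handler only fires for exceptions that actually escape the worker thread. Tasks handed to `ExecutorService.submit()` have their exceptions captured in the returned `Future`, so the handler is not invoked for them; the demo uses `execute()` for that reason.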