[ https://issues.apache.org/jira/browse/FLINK-5759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15861272#comment-15861272 ]
ASF GitHub Bot commented on FLINK-5759:
---------------------------------------

Github user StephanEwen commented on a diff in the pull request:

    https://github.com/apache/flink/pull/3290#discussion_r100535279

    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/util/ExecutorThreadFactory.java ---
    @@ -18,49 +18,112 @@

     package org.apache.flink.runtime.util;

    +import java.lang.Thread.UncaughtExceptionHandler;
     import java.util.concurrent.ThreadFactory;
     import java.util.concurrent.atomic.AtomicInteger;

     import org.slf4j.Logger;
     import org.slf4j.LoggerFactory;

    +import static org.apache.flink.util.Preconditions.checkNotNull;
    +
    +/**
    + * A thread
    --- End diff --

    True, incomplete, will fix that.


> Set an UncaughtExceptionHandler for all Thread Pools in JobManager
> ------------------------------------------------------------------
>
>                 Key: FLINK-5759
>                 URL: https://issues.apache.org/jira/browse/FLINK-5759
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>    Affects Versions: 1.2.0
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>             Fix For: 1.3.0
>
>
> Currently, the thread pools of the {{JobManager}} do not have any
> {{UncaughtExceptionHandler}}.
> While uncaught exceptions are rare (Flink handles exceptions aggressively in
> most places), when exceptions slip through in these threads (which execute
> future responses and delayed actions), the JobManager may be in an
> inconsistent state and not function properly any more.
> We should add a handler that results in a process kill in the case of
> uncaught exceptions. Letting the JobManager be restarted by the respective
> cluster framework is the only guaranteed way to be safe.
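
The handler proposed in the issue description could look roughly like the following. This is a minimal sketch and not the code from the pull request: the class name `ExitingThreadFactory`, the exit code, and the thread naming scheme are assumptions made for illustration. The handler is passed in through the constructor so that a non-fatal handler can be substituted in tests.

```java
import java.lang.Thread.UncaughtExceptionHandler;
import java.util.Objects;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical sketch: a thread factory that attaches an
 * UncaughtExceptionHandler to every thread it creates.
 */
final class ExitingThreadFactory implements ThreadFactory {

    /** Assumed exit code; the issue does not specify one. */
    static final int EXIT_CODE = 1;

    private final String namePrefix;
    private final UncaughtExceptionHandler handler;
    private final AtomicInteger counter = new AtomicInteger(1);

    ExitingThreadFactory(String namePrefix, UncaughtExceptionHandler handler) {
        this.namePrefix = Objects.requireNonNull(namePrefix, "namePrefix");
        this.handler = Objects.requireNonNull(handler, "handler");
    }

    /** Returns a handler that reports the error and then terminates the JVM. */
    static UncaughtExceptionHandler exitOnError() {
        return (thread, error) -> {
            try {
                System.err.println(
                    "FATAL: uncaught exception in thread '" + thread.getName() + "': " + error);
            } finally {
                // Exit even if reporting itself fails, so that the cluster
                // framework can restart the process.
                System.exit(EXIT_CODE);
            }
        };
    }

    @Override
    public Thread newThread(Runnable task) {
        Thread t = new Thread(task, namePrefix + "-thread-" + counter.getAndIncrement());
        t.setDaemon(true);
        t.setUncaughtExceptionHandler(handler);
        return t;
    }
}
```

A pool would be created with, for example, `Executors.newFixedThreadPool(4, new ExitingThreadFactory("jobmanager-future", ExitingThreadFactory.exitOnError()))`. Note that the handler only fires for tasks passed to `execute`; tasks passed to `ExecutorService.submit` have their exceptions captured in the returned `Future` instead.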