[jira] [Commented] (FLINK-5759) Set an UncaughtExceptionHandler for all Thread Pools in JobManager

ASF GitHub Bot (JIRA) Fri, 10 Feb 2017 05:38:05 -0800

    [ https://issues.apache.org/jira/browse/FLINK-5759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15861272#comment-15861272 ]


ASF GitHub Bot commented on FLINK-5759:
---------------------------------------

Github user StephanEwen commented on a diff in the pull request:

    https://github.com/apache/flink/pull/3290#discussion_r100535279
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/util/ExecutorThreadFactory.java ---
    @@ -18,49 +18,112 @@
     
     package org.apache.flink.runtime.util;
     
    +import java.lang.Thread.UncaughtExceptionHandler;
     import java.util.concurrent.ThreadFactory;
     import java.util.concurrent.atomic.AtomicInteger;
     
     import org.slf4j.Logger;
     import org.slf4j.LoggerFactory;
     
    +import static org.apache.flink.util.Preconditions.checkNotNull;
    +
    +/**
    + * A thread 
    --- End diff --
    
    True, incomplete, will fix that.


> Set an UncaughtExceptionHandler for all Thread Pools in JobManager
> ------------------------------------------------------------------
>
>                 Key: FLINK-5759
>                 URL: https://issues.apache.org/jira/browse/FLINK-5759
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>    Affects Versions: 1.2.0
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>             Fix For: 1.3.0
>
>
> Currently, the thread pools of the {{JobManager}} do not have any 
> {{UncaughtExceptionHandler}}.
> Uncaught exceptions are rare, because Flink handles exceptions aggressively 
> in most places. But when an exception does slip through in one of these 
> threads (which execute future responses and delayed actions), the JobManager 
> may be left in an inconsistent state and no longer function properly.
> We should add a handler that kills the process when an exception goes 
> uncaught. Letting the respective cluster framework restart the JobManager 
> is the only guaranteed way to be safe.
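
For illustration only, the idea in the description could be sketched roughly as below. This is not the code from the pull request: the class names FatalExitThreadFactory and FatalExitHandler, the thread naming scheme, and the injectable exit action are all assumptions made for this example. The exit action is passed in as a Runnable so that the sketch can be exercised without actually terminating the JVM.

```java
import java.lang.Thread.UncaughtExceptionHandler;
import java.util.Objects;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical sketch (not Flink's actual implementation): a thread factory
 * whose threads all share one handler that reports an uncaught exception and
 * then runs an exit action, e.g. one that terminates the process.
 */
final class FatalExitThreadFactory implements ThreadFactory {

    /** Reports the uncaught exception, then always runs the exit action. */
    static final class FatalExitHandler implements UncaughtExceptionHandler {
        private final Runnable exitAction;

        FatalExitHandler(Runnable exitAction) {
            this.exitAction = Objects.requireNonNull(exitAction);
        }

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            try {
                System.err.println("FATAL: thread '" + t.getName()
                        + "' terminated by an uncaught exception.");
                e.printStackTrace();
            } finally {
                // Run the exit action even if reporting itself fails.
                exitAction.run();
            }
        }
    }

    private final AtomicInteger threadNumber = new AtomicInteger(1);
    private final String namePrefix;
    private final UncaughtExceptionHandler handler;

    FatalExitThreadFactory(String poolName, Runnable exitAction) {
        this.namePrefix = Objects.requireNonNull(poolName) + "-thread-";
        this.handler = new FatalExitHandler(exitAction);
    }

    @Override
    public Thread newThread(Runnable task) {
        Thread t = new Thread(task, namePrefix + threadNumber.getAndIncrement());
        t.setDaemon(true);
        // Every thread created by this factory gets the fatal-exit handler.
        t.setUncaughtExceptionHandler(handler);
        return t;
    }
}
```

A pool could then be created with something like Executors.newCachedThreadPool(new FatalExitThreadFactory("some-pool", () -> Runtime.getRuntime().halt(1))), where the pool name and exit code are placeholders. Note that the handler only sees exceptions from tasks passed to execute(); a task passed to submit() has its exception captured in the returned Future instead.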



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
