[ https://issues.apache.org/jira/browse/FLINK-5759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15859733#comment-15859733 ]

ASF GitHub Bot commented on FLINK-5759:
---------------------------------------

GitHub user StephanEwen opened a pull request:

    https://github.com/apache/flink/pull/3290

    [FLINK-5759] [jobmanager] Set UncaughtExceptionHandlers for JobManager's
    Future and I/O thread pools

    Currently, the thread pools of the `JobManager` do not have any
    `UncaughtExceptionHandler`.
    
    Uncaught exceptions are rare, because Flink handles exceptions
    aggressively in most places. When one does slip through in these threads
    (which execute future responses and delayed actions), the `JobManager`
    may be left in an inconsistent state and no longer function properly.
    
    This pull request adds a handler that kills the process when an uncaught
    exception occurs. Letting the `JobManager` be restarted by the respective
    cluster framework is the only guaranteed way to be safe.
    
    This also unifies the `ExecutorThreadFactory` and `NamedThreadFactory`.
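
The approach described above can be sketched roughly as follows. This is a
minimal illustration, not the code from the pull request: the class name
`ExitOnUncaughtThreadFactory`, the exit code, and the thread naming scheme are
all assumptions made for the example.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; the class name, exit code, and thread naming
// scheme are assumptions and are not taken from the pull request.
final class ExitOnUncaughtThreadFactory implements ThreadFactory {

    // Default handler: report the error, then terminate the JVM so that the
    // cluster framework can restart the process.
    static final Thread.UncaughtExceptionHandler EXIT_HANDLER = (thread, error) -> {
        try {
            System.err.println("FATAL: uncaught exception in thread '" + thread.getName() + "'");
            error.printStackTrace();
        } finally {
            System.exit(1); // arbitrary non-zero exit code
        }
    };

    private final String poolName;
    private final Thread.UncaughtExceptionHandler handler;
    private final AtomicInteger threadCounter = new AtomicInteger(1);

    ExitOnUncaughtThreadFactory(String poolName) {
        this(poolName, EXIT_HANDLER);
    }

    // The handler is injectable so that it can be replaced, e.g. in tests.
    ExitOnUncaughtThreadFactory(String poolName, Thread.UncaughtExceptionHandler handler) {
        this.poolName = poolName;
        this.handler = handler;
    }

    @Override
    public Thread newThread(Runnable task) {
        Thread t = new Thread(task, poolName + "-thread-" + threadCounter.getAndIncrement());
        t.setDaemon(true);
        t.setUncaughtExceptionHandler(handler);
        return t;
    }
}
```

Such a factory could be passed to, for example,
`Executors.newFixedThreadPool(4, new ExitOnUncaughtThreadFactory("io"))`.
Note that the handler only sees exceptions from tasks started with
`execute()`; tasks started with `submit()` have their exceptions captured in
the returned `Future` instead.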

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StephanEwen/incubator-flink uncaught_handlers

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3290.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3290
    
----
commit 3602631353dbdf230044db7fba1890600e648101
Author: Stephan Ewen <se...@apache.org>
Date:   2017-02-09T13:04:17Z

    [FLINK-5759] [jobmanager] Set UncaughtExceptionHandlers for JobManager's
    Future and I/O thread pools

----


> Set an UncaughtExceptionHandler for all Thread Pools in JobManager
> ------------------------------------------------------------------
>
>                 Key: FLINK-5759
>                 URL: https://issues.apache.org/jira/browse/FLINK-5759
>             Project: Flink
>          Issue Type: Bug
>          Components: JobManager
>    Affects Versions: 1.2.0
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>             Fix For: 1.3.0
>
>
> Currently, the thread pools of the {{JobManager}} do not have any 
> {{UncaughtExceptionHandler}}.
> Uncaught exceptions are rare, because Flink handles exceptions aggressively
> in most places. When one does slip through in these threads (which execute
> future responses and delayed actions), the {{JobManager}} may be left in an
> inconsistent state and no longer function properly.
> We should add a handler that results in a process kill in the case of 
> uncaught exceptions. Letting the JobManager be restarted by the respective 
> cluster framework is the only guaranteed way to be safe.



