Till Rohrmann created FLINK-2790:
------------------------------------

             Summary: Add high availability support for Yarn
                 Key: FLINK-2790
                 URL: https://issues.apache.org/jira/browse/FLINK-2790
             Project: Flink
          Issue Type: Sub-task
            Reporter: Till Rohrmann


Add master high availability support for Yarn. The idea is to let Yarn restart 
a failed application master in a new container. For that, we set the maximum 
number of application attempts to a value greater than 1.

From version 2.4.0 onwards, it is possible to reuse already started containers 
for the TaskManagers, thus avoiding unnecessary restart delays.

From version 2.6.0 onwards, it is possible to specify an interval in which the 
number of application attempts has to be exceeded in order to fail the job. 
This will prevent long running jobs from eventually depleting all available 
application attempts.
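
To make the effect of such an interval concrete, here is a minimal, 
self-contained sketch in plain Java. It is a hypothetical model and not Yarn's 
actual implementation: the class AttemptFailureWindow and its method names are 
invented for illustration. In Yarn itself, the corresponding knobs are presumably 
ApplicationSubmissionContext#setMaxAppAttempts, 
#setKeepContainersAcrossApplicationAttempts (2.4.0 onwards) and 
#setAttemptFailuresValidityInterval (2.6.0 onwards); how Flink would wire them 
up is not specified in this issue.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical model of an attempt-failure validity interval.
 * Assumption: failures older than the interval no longer count
 * towards the maximum number of application attempts.
 */
final class AttemptFailureWindow {
    private final int maxAttempts;
    private final long validityIntervalMs; // <= 0 means failures never expire
    private final Deque<Long> failureTimes = new ArrayDeque<>();

    AttemptFailureWindow(int maxAttempts, long validityIntervalMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.maxAttempts = maxAttempts;
        this.validityIntervalMs = validityIntervalMs;
    }

    /**
     * Records a failed attempt at time nowMs (timestamps must be non-decreasing)
     * and returns true if another attempt may still be started.
     */
    boolean recordFailureAndCheckRestart(long nowMs) {
        failureTimes.addLast(nowMs);
        if (validityIntervalMs > 0) {
            // Drop failures that fell out of the validity interval.
            while (!failureTimes.isEmpty()
                    && nowMs - failureTimes.peekFirst() > validityIntervalMs) {
                failureTimes.removeFirst();
            }
        }
        // Each counted failure consumed one attempt.
        return failureTimes.size() < maxAttempts;
    }

    public static void main(String[] args) {
        AttemptFailureWindow window = new AttemptFailureWindow(2, 1_000L);
        System.out.println(window.recordFailureAndCheckRestart(0L));
        System.out.println(window.recordFailureAndCheckRestart(5_000L));
        System.out.println(window.recordFailureAndCheckRestart(5_500L));
    }
}
```

With an interval, two failures that are far apart each leave one attempt 
available, whereas two failures in quick succession exhaust the budget. Without 
an interval (a value of 0 in this sketch), every failure counts forever, which 
is how a long running job would eventually use up all of its attempts.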





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
