Till Rohrmann created FLINK-2790:
------------------------------------
Summary: Add high availability support for Yarn
Key: FLINK-2790
URL: https://issues.apache.org/jira/browse/FLINK-2790
Project: Flink
Issue Type: Sub-task
Reporter: Till Rohrmann
Add master high availability support for Yarn. The idea is to let Yarn restart
a failed application master in a new container. To enable this, we set the number
of application retries to a value greater than 1.
From Yarn version 2.4.0 onwards, it is possible to reuse already started containers
for the TaskManagers, thus avoiding unnecessary restart delays.
From Yarn version 2.6.0 onwards, it is possible to specify an interval within which
the number of application attempts has to be exceeded in order to fail the job.
This prevents long-running jobs from eventually depleting all available
application attempts.
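
To illustrate why such an interval helps long-running jobs, here is a minimal
sketch of the counting rule. It is a stand-alone model, not Yarn code: the class
AttemptWindow, its method name, and the exact boundary handling are assumptions
made for illustration only. Failures older than the interval are forgotten, so
only failures that cluster within the interval can exhaust the attempts.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical model of attempt counting with a validity interval (not Yarn's implementation). */
public class AttemptWindow {
    private final int maxAttempts;
    // Assumption: a value <= 0 means failures never expire (behaviour without the interval).
    private final long validityIntervalMs;
    private final Deque<Long> failureTimes = new ArrayDeque<>();

    public AttemptWindow(int maxAttempts, long validityIntervalMs) {
        this.maxAttempts = maxAttempts;
        this.validityIntervalMs = validityIntervalMs;
    }

    /** Records a failed attempt at nowMs and returns true if another attempt may be started. */
    public boolean recordFailureAndCheckRestart(long nowMs) {
        failureTimes.addLast(nowMs);
        if (validityIntervalMs > 0) {
            // Drop failures that fall outside the validity interval.
            while (!failureTimes.isEmpty()
                    && nowMs - failureTimes.peekFirst() > validityIntervalMs) {
                failureTimes.removeFirst();
            }
        }
        // Every counted failure has used up one attempt.
        return failureTimes.size() < maxAttempts;
    }
}
```

With maxAttempts = 2 and no interval, the second failure is fatal no matter how
much time lies between the two. With an interval of one second, two failures
five seconds apart still leave the application restartable, because the first
one has expired by the time the second occurs.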
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)