Re: TaskManager HA on YARN

2017-12-04 Thread Till Rohrmann
Hi Hayden, in YARN mode, Flink tolerates as many TM failures as you have configured via `yarn.maximum-failed-containers`. By default, this is set to the initial number of requested TMs. So in your case, the Flink cluster would restart a TM twice and then fail the cluster once a TM fails for the th
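
For illustration only, the limit mentioned above could be raised in `flink-conf.yaml` along these lines. The value 10 is an arbitrary assumption and not something recommended in this thread; pick a number that matches how many container failures you are willing to tolerate.

```yaml
# flink-conf.yaml (sketch; the value below is a hypothetical example)
# Maximum number of failed TaskManager containers the YARN session
# tolerates before the whole cluster is failed. If unset, it defaults
# to the initial number of requested TMs.
yarn.maximum-failed-containers: 10
```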

TaskManager HA on YARN

2017-12-04 Thread Marchant, Hayden
Hi, we are currently starting to test Flink running on YARN. Until now, we've been testing on a standalone cluster. One thing lacking in standalone mode is that we have to manually restart a TaskManager if it dies. I looked at https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/jobmanager_h