Hi Anil,

A typical YARN ResourceManager setup consists of 2 RM nodes [1] in an active/standby configuration.

FYI: we also shared some practical experience with the limitations of this setup, and with potential redundant fail-safe mechanisms, in our latest talk [2] at this year's Flink Forward.
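
For reference, here is a rough sketch of what such an active/standby RM pair looks like in yarn-site.xml, following the property names documented in [1]. The cluster id, host names and ZooKeeper quorum are placeholders for illustration only, not our actual settings, so please treat this as a starting point rather than a recommended configuration.

```xml
<!-- Minimal ResourceManager HA sketch; all values below are placeholders. -->
<configuration>
  <!-- Turn on RM high availability (active/standby). -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <!-- Logical name of the cluster, used during leader election. -->
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>example-cluster</value>
  </property>
  <!-- Logical ids of the two RM nodes. -->
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <!-- Host of each RM; host names are hypothetical. -->
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>rm1.example.com</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>rm2.example.com</value>
  </property>
  <!-- ZooKeeper quorum used for leader election.
       Older Hadoop releases use yarn.resourcemanager.zk-address instead. -->
  <property>
    <name>hadoop.zk.address</name>
    <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
  </property>
</configuration>
```

With a setup along these lines, the standby RM takes over when the active one fails; which RM is currently active can be checked with `yarn rmadmin -getServiceState rm1`.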

Thanks,
Rong

[1] https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html
[2] https://www.ververica.com/resources/flink-forward-san-francisco-2019/-practical-experience-running-flink-in-production

On Thu, May 16, 2019 at 5:08 AM Anil <anilsingh....@gmail.com> wrote:

> Thanks for the clarification Rong!
> As per my understanding, the Docker containers monitors the job Flink Job
> which are running in Yarn Cluster. Flink JM's have HA enabled. So there's a
> standby JM in case the JM fails and in case of TM failure, that TM will be
> re-deployed. All good. My concern is what if the Yarn Master node goes
> down.
> Is the Yarn cluster running with Multi-master or in case of failure do you
> migrate your job do a different cluster. If so is this failover to a
> different cluster built into Athenax.
> Regards,
> Anil.
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/