Hi Anil,

We use Docker because internally we support Dockerized containers for microservices.
Ideally, this can be any external service running somewhere other than the YARN cluster where your Flink application resides. The watchdog runs outside of the Flink cluster because it is designed to capture failures that YARN/Flink cannot recover from on their own, for example a schema evolution in a source/sink, or corrupted data that needs to be skipped. Given that, it does not make sense to run it on the same YARN cluster.

We have since enabled HA for Flink's JM, but it was not enabled at the time of the presentation. I CCed Peter, who might be able to answer this question better.

Thanks,
Rong

On Sat, May 11, 2019 at 10:12 PM Anil <anilsingh....@gmail.com> wrote:

> Thanks Rong. FlinkForward talk was insightful.
> One more question, it's mentioned in the talk that the jobs are running on
> Yarn and are monitored by containers running on Docker. Can you explain why
> is Docker needed here. When we deploy job to Yarn, one Yarn container is
> already dedicated for Job Manager which monitors the job. What additional
> functionality does Docker provide here.
> Also when the jobs are deployed on Yarn, the Master Node becomes a Single
> point of failure. Are you using a Multi-Master setup or have taken another
> approach to handle failover.
> Regards,
> Anil.
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/