subject:"JobManager doesn't recover in HA mode"

Re: JobManager doesn't recover in HA mode

2018-01-31 Thread Mu Kong

Ah, I think I can just use ./bin/jobmanager.sh, as described at https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/deployment/cluster_setup.html#adding-a-jobmanager Thanks! On Thu, Feb 1, 2018 at 4:00 PM, Mu Kong wrote: > Hi Tony, > > Thanks for your response! > I would definitely check supervisord
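A minimal sketch of what bringing the killed JobManager back with that script might look like on a Flink 1.4 standalone HA cluster. The `start cluster` arguments follow the usage shown on the linked 1.4 documentation page; later Flink releases changed the script's arguments, so check the usage string of your own version. The function name `restart_jobmanager`, the `FLINK_HOME` default of `/opt/flink`, and the `DRY_RUN` switch are assumptions made for illustration and are not part of Flink.

```shell
# Sketch only: restart a JobManager on the host where it was killed.
# FLINK_HOME (default /opt/flink) and DRY_RUN are hypothetical helpers,
# not Flink settings.
restart_jobmanager() {
  flink_home="${FLINK_HOME:-/opt/flink}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command instead of running it.
    echo "$flink_home/bin/jobmanager.sh start cluster"
  else
    # Flink 1.4 usage per the linked docs: jobmanager.sh start cluster
    "$flink_home/bin/jobmanager.sh" start cluster
  fi
}
```

Calling `DRY_RUN=1 restart_jobmanager` prints the command without running it. In HA mode the restarted process should register with ZooKeeper and, as Tony describes in the thread, act as a standby rather than reclaim leadership.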

Re: JobManager doesn't recover in HA mode

2018-01-31 Thread Mu Kong

Hi Tony, Thanks for your response! I will definitely check out supervisord. I wonder if there is a way to recover the killed JM and add it back to the cluster using one of the scripts in *flink/bin/*. Thanks! Best regards, Mu On Thu, Feb 1, 2018 at 3:50 PM, Tony Wei wrote: > Hi

Re: JobManager doesn't recover in HA mode

2018-01-31 Thread Tony Wei

Hi Mu, AFAIK, that is the expected behavior when you launch your cluster in standalone mode. Flink HA guarantees that the standby JM will take over the whole cluster. The illustration only says that a recovered JM will become another standby machine, but recovering a single instance is not the Flink HA's

JobManager doesn't recover in HA mode

2018-01-31 Thread Mu Kong

Hi all, I have a Flink HA cluster with 2 job managers and a ZooKeeper quorum of 3 nodes. My failed job manager was not recovered after I killed it. Here is how I did it and what I observed: 1. I started the HA cluster with start-cluster.sh 2. Job manager A got elected. 3. I killed job m