Hi all,

I have a Flink HA cluster with 2 job managers and a ZooKeeper quorum of 3 nodes.
My failed job manager was not recovered after I killed it. Here is how I did it and what I observed:

1. I started the HA cluster with start-cluster.sh.
2. Job manager A got elected.
3. I killed job manager A with the kill command.
4. Job manager B got elected.
5. Job manager B was working well.
6. But job manager A never recovered after that.

Am I missing something here, or is it the case that HA cannot handle this kind of failover (the Flink instance being killed directly)?

Thanks!

Best regards,
Mu