Hi Tzanko,

Till is probably the better person to answer this question.

Thanks,
vino.

Tzanko Matev <tsa...@gmail.com> wrote on Wed, Sep 19, 2018 at 5:47 PM:

> Dear all,
>
> I am currently experimenting with a Flink 1.6.0 job cluster. The goal is
> to run a streaming job on K8s. Right now I am using docker-compose to
> experiment with the job cluster.
>
> I am trying to set up HA with Zookeeper, but I seem to fail. I have a
> docker-compose file which contains the following services:
> - Zookeeper
> - Flink job manager
> - Flink task manager
>
> The containers are set up as per the documentation for docker-compose, but
> I have also set up the necessary HA settings in the conf file. However,
> when I kill the job manager container and start it again, the job being
> processed does not recover but always starts from scratch. Instead I get
> the following error:
>
> ERROR org.apache.flink.runtime.rest.handler.job.JobDetailsHandler -
> Could not retrieve the redirect address.
>
> java.util.concurrent.CompletionException:
> org.apache.flink.runtime.rpc.exceptions.FencingTokenException: Fencing
> token not set: Ignoring message
> LocalFencedMessage(8c4887f5c13f6d907d82a55d97ac428f,
> LocalRpcInvocation(requestRestAddress(Time))) sent to
> akka.tcp://flink@blockprocessor-job-cluster:50000/user/dispatcher because
> the fencing token is null.
>
> Am I missing something? Is HA implemented for job clusters at all?
>
> Best wishes,
> Tzanko Matev
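
For reference, ZooKeeper HA in Flink 1.6 is normally driven by a handful of keys in flink-conf.yaml. The fragment below is only a sketch of what "the necessary HA settings" usually look like, not the configuration from the quoted message. The quorum address `zookeeper:2181`, the cluster id, and the storage path are assumptions chosen to match a docker-compose setup like the one described. The port 50000 is taken from the error message.

```yaml
# Sketch of ZooKeeper HA settings for flink-conf.yaml (Flink 1.6).
# All values are illustrative assumptions, not the poster's actual config.
high-availability: zookeeper

# Assumes the docker-compose service for ZooKeeper is named "zookeeper".
high-availability.zookeeper.quorum: zookeeper:2181

# Root znode under which Flink keeps its cluster nodes.
high-availability.zookeeper.path.root: /flink

# Hypothetical cluster id; it must stay the same across restarts.
high-availability.cluster-id: /blockprocessor

# JobManager metadata is written here, and ZooKeeper only stores pointers to it.
# Hypothetical path; it needs to be on a volume that outlives the container.
high-availability.storageDir: file:///flink-ha/

# Fixed RPC port for the JobManager in HA mode (matches the port in the error).
high-availability.jobmanager.port: 50000
```

One thing that may be worth checking with a setup like this: if `high-availability.storageDir` points inside the job manager container's own filesystem rather than at a mounted volume, the metadata is lost when the container is removed and recreated, so there is nothing to recover from.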