Dear all,

I am currently experimenting with a Flink 1.6.0 job cluster. The goal is to
run a streaming job on K8s. Right now I am using docker-compose to
experiment with the job cluster.

I am trying to set up HA with ZooKeeper, but so far without success. My
docker-compose file contains the following services:
- ZooKeeper
- Flink job manager
- Flink task manager

The containers are set up as described in the documentation for
docker-compose, and I have added the necessary HA settings to the conf
file. However, when I kill the job manager container and start it again,
the job does not recover; it always starts from scratch, and I get the
following error:

> ERROR org.apache.flink.runtime.rest.handler.job.JobDetailsHandler - Could not retrieve the redirect address.
>
> java.util.concurrent.CompletionException: org.apache.flink.runtime.rpc.exceptions.FencingTokenException: Fencing token not set: Ignoring message LocalFencedMessage(8c4887f5c13f6d907d82a55d97ac428f, LocalRpcInvocation(requestRestAddress(Time))) sent to akka.tcp://flink@blockprocessor-job-cluster:50000/user/dispatcher because the fencing token is null.
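
For context, the HA settings mentioned above are of roughly the following
kind. This is only a sketch of a flink-conf.yaml fragment, not the actual
file: the ZooKeeper host name, the storage path, the root path and the
cluster id are all assumed placeholder values.

```yaml
# Hypothetical flink-conf.yaml fragment; every value below is a placeholder.
high-availability: zookeeper
# Assumes the ZooKeeper container is reachable as "zookeeper" on its default port.
high-availability.zookeeper.quorum: zookeeper:2181
# Must point to storage that survives a job manager restart and is visible
# to all containers (for example a shared volume or a distributed file system).
high-availability.storageDir: file:///flink/ha
# Assumed names; they only need to stay the same across restarts.
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: /blockprocessor
```

If the storage directory lives only inside the job manager container, the
recovery metadata would be lost together with the container, so that is one
thing worth checking in a setup like this.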

Am I missing something? Is HA implemented for job clusters at all?

Best wishes,
Tzanko Matev
