I think you need to enable HA (high availability) for your Flink cluster[1].
Currently, we have the ZooKeeperHAService and KubernetesHAService. In HA mode,
all the metadata (e.g. job graph path, checkpoint counter, checkpoint path)
will be stored in ZooKeeper or a Kubernetes ConfigMap. And the
Hello -
I am running some tests with Flink on Kubernetes. Every five to ten days or so,
all the jobs disappear from the running jobs list. There's nothing under completed
jobs, and there's no record of the submitted jar files in the cluster.
In some manner or another, it is almost like going into