Hi Rock,

If you want to start an HA Flink cluster on K8s, the simplest way is to use ZK + HDFS/S3, just like the HA configuration on YARN (see the first sketch below). The zookeeper-operator could help you start a ZK cluster [1]. Please share more information about why it did not work for you.

If you are using a Kubernetes per-job cluster, the job can be recovered when the JM pod crashes and is restarted [2]. A savepoint can also be used to get a better recovery point (see the second sketch below).
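For the ZK setup, a minimal flink-conf.yaml sketch follows; the quorum address, bucket, and cluster id are placeholders for your environment, not defaults:

    high-availability: zookeeper
    high-availability.zookeeper.quorum: zk-client:2181
    high-availability.storageDir: s3://<bucket>/flink/ha/
    high-availability.cluster-id: /my-flink-cluster

Note the storageDir has to be on a filesystem that every JM and TM can reach (HDFS, S3, etc.), so a PVC alone is usually not enough unless it is a shared volume.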
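For the per-job mode, here is a hedged sketch of the container args in the K8s job template from [2]; the job class and savepoint path are placeholders, and --fromSavepoint only applies if the StandaloneJobClusterEntryPoint in your Flink version supports it:

    # placeholders: com.example.StreamingJob, s3://<bucket>/savepoints/savepoint-xxxx
    args: ["job-cluster",
           "--job-classname", "com.example.StreamingJob",
           "--fromSavepoint", "s3://<bucket>/savepoints/savepoint-xxxx",
           "-Djobmanager.rpc.address=flink-job-cluster"]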
[1] https://github.com/pravega/zookeeper-operator
[2] https://github.com/apache/flink/blob/release-1.9/flink-container/kubernetes/README.md#deploy-flink-job-cluster

On Sat, Nov 16, 2019 at 5:00 PM vino yang <yanghua1...@gmail.com> wrote:

> Hi Rock,
>
> I searched on Google and found a blog post [1] that talks about how to
> configure JM HA for Flink on K8s. I do not know whether it is suitable
> for you or not; please feel free to refer to it.
>
> Best,
> Vino
>
> [1]:
> http://shzhangji.com/blog/2019/08/24/deploy-flink-job-cluster-on-kubernetes/
>
> On Sat, Nov 16, 2019 at 11:02 AM Rock <downsidem...@qq.com> wrote:
>
>> I'm trying to set up a Flink cluster on K8s for production use, but the
>> setup described here
>> https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/deployment/kubernetes.html
>> is not HA: when the job manager goes down and is rescheduled, the
>> metadata for running jobs is lost.
>>
>> I tried the ZK HA setup
>> https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/jobmanager_high_availability.html
>> on K8s, but could not get it right.
>>
>> Storing a job's metadata on K8s using a PVC or another external file
>> system should be very easy. Is there a way to achieve it?