Hi,

We cannot start a job from a savepoint (created with Flink 1.12, Standalone
Kubernetes + ZooKeeper HA) on Flink 1.12, Standalone Kubernetes +
Kubernetes HA. The following is the exception that stops the job:

    Caused by: java.util.concurrent.CompletionException:
    org.apache.flink.kubernetes.kubeclient.resources.KubernetesException:
    Cannot retry checkAndUpdateConfigMap with configMap
    name-51e5afd90227d537ff442403d1b279da-jobmanager-leader
    because it does not exist.
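
By "start from savepoint" we mean a submission along these lines (the
savepoint path and jar below are placeholders, not our exact values):

    # Placeholder savepoint path and jar; in application mode the same
    # savepoint path is passed to standalone-job via --fromSavepoint instead.
    flink run \
        -s gs://some/path/savepoints/savepoint-xxxx \
        -d \
        /path/to/job.jar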


The cluster can start a new job from scratch, so we think the cluster
configuration is good.

The following is the HA-related config:

    high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
    high-availability.storageDir: gs://some/path/recovery
    kubernetes.cluster-id: cluster-name
    kubernetes.context: kubernetes-context
    kubernetes.namespace: kubernetes-namespace
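
If it helps to reproduce, the leader ConfigMaps that Kubernetes HA writes
can be listed with something like this (namespace and cluster-id as
configured above):

    # List the HA ConfigMaps for the cluster; the one named in the
    # exception (...-jobmanager-leader) should show up here once created.
    kubectl -n kubernetes-namespace get configmaps | grep cluster-name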


-- 
ChangZhuo Chen (陳昌倬) czchen@{czchen,debconf,debian}.org
http://czchen.info/
Key fingerprint = BA04 346D C2E1 FE63 C790  8793 CC65 B0CD EC27 5D5B
