[ https://issues.apache.org/jira/browse/FLINK-37392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17930902#comment-17930902 ]
junzhong qin commented on FLINK-37392:
--------------------------------------

[~lcastelli] Thank you for your reply. I will test with your commit.

> Kubernetes Operator UpgradeFailureException: HA metadata not available to restore from last state
> -------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-37392
>                 URL: https://issues.apache.org/jira/browse/FLINK-37392
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kubernetes Operator
>            Reporter: junzhong qin
>            Priority: Not a Priority
>
> I run a bounded stream SQL job on Kubernetes, and I set
> {code:java}
> kubernetes.operator.jm-deployment.shutdown-ttl: 5 m {code}
> When the job exits, the operator log always shows
> {code:java}
> o.a.f.k.o.c.FlinkDeploymentController [ERROR][{ns}/{cluster-id}] Error while upgrading Flink Deployment
> org.apache.flink.kubernetes.operator.exception.UpgradeFailureException: HA metadata not available to restore from last state. It is possible that the job has finished or terminally failed, or the configmaps have been deleted.
>     at org.apache.flink.kubernetes.operator.service.AbstractFlinkService.validateHaMetadataExists(AbstractFlinkService.java:946)
>     at org.apache.flink.kubernetes.operator.service.AbstractFlinkService.submitApplicationCluster(AbstractFlinkService.java:213)
>     at org.apache.flink.kubernetes.operator.reconciler.deployment.ApplicationReconciler.deploy(ApplicationReconciler.java:183)
>     at org.apache.flink.kubernetes.operator.reconciler.deployment.ApplicationReconciler.deploy(ApplicationReconciler.java:64)
>     at org.apache.flink.kubernetes.operator.reconciler.deployment.AbstractJobReconciler.restoreJob(AbstractJobReconciler.java:387)
>     at org.apache.flink.kubernetes.operator.reconciler.deployment.AbstractJobReconciler.resubmitJob(AbstractJobReconciler.java:555)
>     at org.apache.flink.kubernetes.operator.reconciler.deployment.ApplicationReconciler.reconcileOtherChanges(ApplicationReconciler.java:294)
>     at org.apache.flink.kubernetes.operator.reconciler.deployment.AbstractFlinkResourceReconciler.reconcile(AbstractFlinkResourceReconciler.java:173)
>     at org.apache.flink.kubernetes.operator.controller.FlinkDeploymentController.reconcile(FlinkDeploymentController.java:152)
>     at org.apache.flink.kubernetes.operator.controller.FlinkDeploymentController.reconcile(FlinkDeploymentController.java:61)
>     at io.javaoperatorsdk.operator.processing.Controller$1.execute(Controller.java:153)
>     at io.javaoperatorsdk.operator.processing.Controller$1.execute(Controller.java:111)
>     at org.apache.flink.kubernetes.operator.metrics.OperatorJosdkMetrics.timeControllerExecution(OperatorJosdkMetrics.java:80)
>     at io.javaoperatorsdk.operator.processing.Controller.reconcile(Controller.java:110)
>     at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.reconcileExecution(ReconciliationDispatcher.java:136)
>     at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleReconcile(ReconciliationDispatcher.java:117)
>     at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleDispatch(ReconciliationDispatcher.java:91)
>     at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleExecution(ReconciliationDispatcher.java:64)
>     at io.javaoperatorsdk.operator.processing.event.EventProcessor$ReconcilerExecutor.run(EventProcessor.java:452)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>     at java.base/java.lang.Thread.run(Unknown Source) {code}
> Can the operator delete the HA metadata when `kubernetes.operator.jm-deployment.shutdown-ttl` is reached?

--
This message was sent by Atlassian Jira
(v8.20.10#820010)