[ https://issues.apache.org/jira/browse/FLINK-30305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17643599#comment-17643599 ]
Alexis Sarda-Espinosa commented on FLINK-30305:
-----------------------------------------------

I briefly looked through the code on the main branch, and I see that [this line in ApplicationReconciler|https://github.com/apache/flink-kubernetes-operator/blob/d382c74ea04fbe17ab41f42559d663d55d21763a/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/deployment/ApplicationReconciler.java#L182] sets the {{deleteHaConfigMaps}} flag to true when calling {{deleteCluster}}. Now, I'm not very familiar with Flink's internals, but a few lines above the one I linked, {{SavepointConfigOptions.SAVEPOINT_PATH}} is set in the deployment config if a savepoint is found. Would Flink still honor the configured {{SavepointConfigOptions.SAVEPOINT_PATH}} even if it finds HA metadata? (A rough, simplified sketch of this flow is included at the end of this page.)

Regardless of the answer to that question, assume for a moment that we follow the scenario I described but the HA ConfigMaps are _not_ deleted. I could roll back my changes and the operator would detect those ConfigMaps, so even though the job never ran successfully with the new spec, the operator would propagate the rolled-back spec and let the JobManager continue from the checkpoint stored in the HA metadata (savepoint information is not stored in HA metadata, right?). In that case the savepoint goes essentially unused. I imagine this is also suboptimal: rolling back to a checkpoint that is potentially older than the savepoint means some data could be re-processed and create duplicates.

So I can agree that it is probably best to let the user clean up manually and perhaps use the savepoint as the initial savepoint, but I still don't understand why the operator deletes the HA ConfigMaps in this case. If they were kept, the user could still choose: either roll back and continue from the checkpoint, or do the manual cleanup and set the latest savepoint as the initial one.

> Operator deletes HA metadata during stateful upgrade, preventing potential manual rollback
> -------------------------------------------------------------------------------------------
>
>                 Key: FLINK-30305
>                 URL: https://issues.apache.org/jira/browse/FLINK-30305
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.2.0
>            Reporter: Alexis Sarda-Espinosa
>            Priority: Major
>
> I was testing the resiliency of jobs with Kubernetes-based HA enabled, upgrade mode = {{savepoint}}, and with _automatic_ rollback _disabled_ in the operator. After the job was running, I purposely created an erroneous spec by changing my pod template to include an entry in {{envFrom -> secretRef}} with a name that doesn't exist. Schema validation passed, so the operator tried to upgrade the job, but the new pod hung with {{CreateContainerConfigError}}, and I saw this in the operator logs:
> {noformat}
> >>> Status | Info | UPGRADING | The resource is being upgraded
> Deleting deployment with terminated application before new deployment
> Deleting JobManager deployment and HA metadata.
> {noformat}
> Afterwards, even if I remove the non-existing entry from my pod template, the operator can no longer propagate the new spec because "Job is not running yet and HA metadata is not available, waiting for upgradeable state".
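
For reference, here is a minimal Java sketch of the upgrade flow described in the comment above. It is a paraphrase under assumptions, not the actual operator code: the class and method names ({{SavepointUpgradeSketch}}, {{redeployWithSavepoint}}, {{deleteClusterDeployment}}) are made-up placeholders, and only the "execution.savepoint.path" key (i.e. {{SavepointConfigOptions.SAVEPOINT_PATH}}) is a real Flink option.

{code:java}
import org.apache.flink.configuration.Configuration;

import java.util.Optional;

/**
 * Illustrative sketch only: it paraphrases the upgrade flow discussed in the
 * comment above and is NOT the actual ApplicationReconciler code. All names
 * here are placeholders.
 */
public class SavepointUpgradeSketch {

    void redeployWithSavepoint(Configuration deployConfig, Optional<String> lastSavepoint) {
        // If a savepoint was taken for the upgrade, its path is written into the
        // deployment config (the comment refers to SavepointConfigOptions.SAVEPOINT_PATH,
        // whose key is "execution.savepoint.path").
        lastSavepoint.ifPresent(path ->
                deployConfig.setString("execution.savepoint.path", path));

        // The old cluster is then torn down together with its HA ConfigMaps. This is
        // the step questioned above: it removes the checkpoint pointers that a manual
        // rollback could otherwise have used.
        boolean deleteHaConfigMaps = true;
        deleteClusterDeployment(deleteHaConfigMaps);

        // A new deployment is submitted afterwards; if it never reaches RUNNING,
        // neither HA metadata nor a fresh savepoint is left behind for a rollback.
    }

    // Placeholder standing in for the operator's cluster-deletion call.
    private void deleteClusterDeployment(boolean deleteHaConfigMaps) {
        // no-op in this sketch
    }
}
{code}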