[ https://issues.apache.org/jira/browse/FLINK-30305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17643599#comment-17643599 ]

Alexis Sarda-Espinosa commented on FLINK-30305:
-----------------------------------------------

I briefly looked through the code on the main branch, and I see that [this line in ApplicationReconciler|https://github.com/apache/flink-kubernetes-operator/blob/d382c74ea04fbe17ab41f42559d663d55d21763a/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/deployment/ApplicationReconciler.java#L182] sets the {{deleteHaConfigMaps}} flag to true when calling {{deleteCluster}}.
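
For context on what those HA ConfigMaps are (this is general Flink behavior, not something taken from the linked code): with Kubernetes-based HA, Flink keeps leader information and pointers to the latest completed checkpoints in ConfigMaps prefixed with the cluster id, while the actual state lives under the HA storage directory. A minimal sketch of the configuration that produces them, with made-up cluster id and storage path:

{code:java}
import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.HighAvailabilityOptions;
import org.apache.flink.kubernetes.configuration.KubernetesConfigOptions;

public class KubernetesHaConfigSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Kubernetes-based HA services: leader election data and checkpoint
        // pointers end up in ConfigMaps named after the cluster id.
        conf.set(
                HighAvailabilityOptions.HA_MODE,
                "org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory");

        // Hypothetical values, only for illustration.
        conf.set(KubernetesConfigOptions.CLUSTER_ID, "my-flink-job");
        conf.set(HighAvailabilityOptions.HA_STORAGE_PATH, "s3://my-bucket/flink-ha");

        System.out.println(conf);
    }
}
{code}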

Now, I'm not very familiar with Flink's internals, but a few lines above the 
one I linked I see that {{SavepointConfigOptions.SAVEPOINT_PATH}} is set in the 
deployment config if a savepoint is found. Would Flink still honor the 
configured {{SavepointConfigOptions.SAVEPOINT_PATH}} even if it finds HA 
metadata?
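
To make that question concrete, this is roughly what setting the option amounts to on the Flink side; the savepoint path below is a made-up example, and per the lines mentioned above the operator only sets it when a savepoint is found:

{code:java}
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.jobgraph.SavepointConfigOptions;

public class SavepointRestoreSketch {
    public static void main(String[] args) {
        Configuration deployConfig = new Configuration();

        // "execution.savepoint.path": the savepoint the job should restore from
        // on startup. Hypothetical location, only for illustration.
        deployConfig.set(
                SavepointConfigOptions.SAVEPOINT_PATH,
                "s3://my-bucket/savepoints/savepoint-abc123");

        System.out.println(deployConfig.get(SavepointConfigOptions.SAVEPOINT_PATH));
    }
}
{code}

In other words, my question is whether this option still takes effect when the JM also finds checkpoint pointers in the HA metadata.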

Regardless of the answer to that question, let's assume for a moment that we follow the scenario I described but the HA CMs are _not_ deleted. I could roll back my changes and the operator would detect those CMs, so even though the job never ran successfully with the new spec, the operator would propagate the rolled-back spec and let the JM continue from the checkpoint stored in the HA metadata (savepoint information is not stored in HA metadata, right?). In that case the savepoint would basically go unused.

I imagine this is also suboptimal, since rolling back and using a checkpoint that is potentially older than the savepoint means some data could be re-processed and create duplicates. So I could agree that it's probably best to let the user clean up manually and maybe use the savepoint as the initial savepoint, but I still don't understand why the operator deletes the HA CMs in this case; if they were kept, the user could still decide: either roll back and use the checkpoint, or do manual cleanup and set the latest savepoint as the initial one.

> Operator deletes HA metadata during stateful upgrade, preventing potential 
> manual rollback
> ------------------------------------------------------------------------------------------
>
>                 Key: FLINK-30305
>                 URL: https://issues.apache.org/jira/browse/FLINK-30305
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.2.0
>            Reporter: Alexis Sarda-Espinosa
>            Priority: Major
>
> I was testing resiliency of jobs with Kubernetes-based HA enabled, upgrade 
> mode = {{savepoint}}, and with _automatic_ rollback _disabled_ in the 
> operator. After the job was running, I purposely created an erroneous spec by 
> changing my pod template to include an entry in {{envFrom -> secretRef}} with 
> a name that doesn't exist. Schema validation passed, so the operator tried to 
> upgrade the job, but the new pod hung with {{CreateContainerConfigError}}, 
> and I saw this in the operator logs:
> {noformat}
> >>> Status | Info    | UPGRADING       | The resource is being upgraded
> Deleting deployment with terminated application before new deployment
> Deleting JobManager deployment and HA metadata.
> {noformat}
> Afterwards, even if I remove the non-existing entry from my pod template, the 
> operator can no longer propagate the new spec because "Job is not running yet 
> and HA metadata is not available, waiting for upgradeable state".


