Aitozi created FLINK-28531:
------------------------------

             Summary: Shutdown cluster after history server archive finished
                 Key: FLINK-28531
                 URL: https://issues.apache.org/jira/browse/FLINK-28531
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Coordination
            Reporter: Aitozi


I ran into a problem where the job cluster may be shut down before the 
history server archive file upload has finished.

After some investigation, it appears there may be two causes.

First, the shutdown does not wait for 
{{HistoryServerArchivist#archiveExecutionGraph}} to complete.

Second, {{KubernetesResourceManagerDriver#deregisterApplication}} directly 
removes the deployment. The shutdown flow in ClusterEntrypoint triggers the 
deployment deletion first, so the master pod may be deleted while some 
operations/futures have not yet finished.
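
To illustrate the ordering I would expect, here is a minimal sketch in plain 
Java. It is not Flink code: the class ShutdownOrderSketch and the two static 
methods are hypothetical stand-ins that only borrow the names 
archiveExecutionGraph and deregisterApplication, and the 50 ms sleep merely 
simulates a slow upload. The idea is that the deployment deletion is chained 
after the archive future instead of being started independently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class ShutdownOrderSketch {

    // Hypothetical stand-in for the archive upload (not the real Flink API).
    static CompletableFuture<Void> archiveExecutionGraph(List<String> log) {
        return CompletableFuture.runAsync(() -> {
            try {
                Thread.sleep(50); // simulate a slow upload
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            log.add("archive finished");
        });
    }

    // Hypothetical stand-in for removing the Kubernetes deployment.
    static CompletableFuture<Void> deregisterApplication(List<String> log) {
        log.add("deployment deleted");
        return CompletableFuture.completedFuture(null);
    }

    // Chain the deletion after the archive future so the pod is not removed
    // while the upload is still running.
    static List<String> shutDown() {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        archiveExecutionGraph(log)
                // Assumption: a failed archive should not block shutdown.
                .exceptionally(t -> null)
                .thenCompose(ignored -> deregisterApplication(log))
                .join();
        return new ArrayList<>(log);
    }

    public static void main(String[] args) {
        System.out.println(shutDown());
    }
}
```

In this sketch the deletion step only runs once the archive future has 
completed. A real fix would probably also need a timeout on the archive step.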



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
