Re: Checkpoint metadata deleted by Flink after ZK connection issues

Cristian Fri, 04 Sep 2020 21:24:10 -0700


My suspicion is that somewhere in the path where it fails to connect to 
ZooKeeper, the exception is swallowed, so instead of running the shutdown path 
for when the job fails, the general shutdown path is taken.
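
To illustrate the suspicion, here is a minimal Java sketch. Every class and 
method name in it is hypothetical and none of it is Flink's actual code. It 
assumes only that the general shutdown path cleans up HA data when a boolean 
flag is set, and that a swallowed connection failure ends up on that path.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: these names are illustrative, not Flink's real classes.
final class ShutdownPathSketch {

    enum Status { SUCCEEDED, FAILED }

    /** Stand-in for the HA store that holds checkpoint metadata pointers. */
    static final class HaStore {
        final List<String> checkpointPointers = new ArrayList<>(List.of("chk-42"));

        void cleanUp() {
            checkpointPointers.clear();
        }
    }

    /** Simulates losing the ZooKeeper connection. */
    static void connectToZooKeeper() {
        throw new IllegalStateException("ZooKeeper connection lost");
    }

    /** General shutdown: removes HA data whenever cleanupHaData is true. */
    static void shutDown(HaStore store, Status status, boolean cleanupHaData) {
        System.out.println("Shutting down with status " + status
                + ", cleanupHaData=" + cleanupHaData);
        if (cleanupHaData) {
            store.cleanUp();
        }
    }

    /** Suspected behavior: the failure is swallowed and the general path runs. */
    static void runSwallowing(HaStore store) {
        Status status = Status.SUCCEEDED;
        try {
            connectToZooKeeper();
        } catch (IllegalStateException e) {
            status = Status.FAILED; // the cause is dropped here
        }
        shutDown(store, status, true); // HA data is cleaned despite the failure
    }

    /** Expected behavior: a failure takes a path that keeps the HA data. */
    static void runPropagating(HaStore store) {
        try {
            connectToZooKeeper();
            shutDown(store, Status.SUCCEEDED, true);
        } catch (IllegalStateException e) {
            shutDown(store, Status.FAILED, false);
        }
    }
}
```

In this sketch, runSwallowing leaves the store empty while runPropagating 
keeps the "chk-42" pointer, which is the difference between losing and 
retaining the checkpoint metadata.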


This was fortunately a job for which we had a savepoint from yesterday. 
Otherwise we would have been in serious trouble. 


On Fri, Sep 4, 2020, at 9:12 PM, Qingdong Zeng wrote:
> Hi Cristian,
> 
> In the log, we can see that it went to the method
> shutDownAsync(applicationStatus, null, true);
>
> ``   
> 2020-09-04 17:32:07,950 INFO 
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - Shutting
> StandaloneApplicationClusterEntryPoint down with application status FAILED.
> Diagnostics null.
> ``   
> 
> In the general shutdown path, cleaning up HaData by default is normal. So the
> problem is not why HaData is cleaned up in the general shutdown path, but why
> it went to the general shutdown path when your cluster failed.
> 
> I am going to have lunch, and plan to analyze the log in the afternoon.
> 
> Best,
> Qingdong Zeng
> 
> 
> 
> --
> Sent from: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>
