My suspicion is that somewhere in the path where it fails to connect to ZooKeeper, the exception is swallowed, so instead of running the shutdown path for a failed job, the general shutdown path is taken.
This was fortunately a job for which we had a savepoint from yesterday. Otherwise we would have been in serious trouble.

On Fri, Sep 4, 2020, at 9:12 PM, Qingdong Zeng wrote:
> Hi Cristian,
>
> In the log, we can see it went to the method
> shutDownAsync(applicationStatus,null,true);
>
> ``
> 2020-09-04 17:32:07,950 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Shutting StandaloneApplicationClusterEntryPoint down with application status FAILED. Diagnostics null.
> ``
>
> In general shutdown path, default to clean up HaData is normal. So the
> problem is not why we clean up HaData in general shutdown path, but why it
> went to the general shutdown path when your cluster fails.
>
> I am going to have lunch, and plan to analyze the log in the afternoon.
>
> Best,
> Qingdong Zeng
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/