It means that checkpoints are usually dropped after the job is terminated by the user (unless they are explicitly configured as retained checkpoints). You could use "ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION" to keep your checkpoint when the job is cancelled or fails.
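For reference, here is a minimal sketch of how retained checkpoints might be enabled in the job code. It assumes a Flink 1.x DataStream job; the class name, job name, 60-second interval, and the tiny fromElements pipeline are placeholders for illustration only. In newer Flink releases enableExternalizedCheckpoints is, as far as I know, deprecated in favour of setExternalizedCheckpointCleanup, so please check the docs for the version you run.

```java
import org.apache.flink.streaming.api.environment.CheckpointConfig.ExternalizedCheckpointCleanup;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Hypothetical example class; only the checkpoint configuration matters here.
public class RetainedCheckpointExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Take a checkpoint every 60 seconds (assumed interval, tune for your job).
        env.enableCheckpointing(60_000L);

        // Keep the latest externalized checkpoint when the job is cancelled,
        // instead of letting Flink delete it on cancellation.
        env.getCheckpointConfig()
           .enableExternalizedCheckpoints(ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);

        // Placeholder pipeline so the job has something to execute.
        env.fromElements(1, 2, 3).print();

        env.execute("retained-checkpoint-example");
    }
}
```

With this setting the checkpoint files stay in the configured checkpoint directory after cancellation, so you have to clean them up yourself once they are no longer needed.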
When your ZooKeeper connection was lost, the high-availability system, which relies on ZooKeeper, failed as well, and that led your application to stop without retrying. I have a question: if your application lost its ZooKeeper connection, how did it delete the data in ZooKeeper?