Aljoscha Krettek created FLINK-7783:
---------------------------------------

             Summary: Don't always remove checkpoints in 
ZooKeeperCompletedCheckpointStore#recover()
                 Key: FLINK-7783
                 URL: https://issues.apache.org/jira/browse/FLINK-7783
             Project: Flink
          Issue Type: Sub-task
          Components: State Backends, Checkpointing
    Affects Versions: 1.3.2, 1.4.0
            Reporter: Aljoscha Krettek
            Priority: Blocker
             Fix For: 1.4.0, 1.3.3


Currently, we always delete checkpoint handles if they (or the data from the 
DFS) cannot be read: 
https://github.com/apache/flink/blob/91a4b276171afb760bfff9ccf30593e648e91dfb/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/ZooKeeperCompletedCheckpointStore.java#L180

This can lead to problems when the DFS is temporarily not available, i.e. we 
could inadvertently delete all checkpoints even though they are still valid.
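
To make the concern concrete, here is a minimal Java sketch of a recovery loop 
that skips unreadable handles instead of deleting them. It is only an 
illustration under stated assumptions, not the Flink code and not necessarily 
the fix that will be applied: the class CheckpointRecoverySketch, the Handle 
interface and the plain Map standing in for the ZooKeeper store are all 
hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class CheckpointRecoverySketch {

    /** Hypothetical stand-in for a state handle whose data lives on the DFS. */
    interface Handle {
        String retrieve() throws IOException;
    }

    /**
     * Tries to read every handle in the (hypothetical) store.
     * Unreadable handles are skipped but never removed, so a temporary
     * DFS outage cannot wipe out checkpoints that are still valid.
     */
    static List<String> recover(Map<String, Handle> store) throws IOException {
        List<String> recovered = new ArrayList<>();
        IOException lastFailure = null;

        for (Map.Entry<String, Handle> entry : store.entrySet()) {
            try {
                recovered.add(entry.getValue().retrieve());
            } catch (IOException e) {
                // Remember the failure, but leave the handle in the store.
                lastFailure = e;
            }
        }

        // If nothing could be read, fail loudly rather than start from scratch.
        if (recovered.isEmpty() && lastFailure != null) {
            throw new IOException(
                    "Could not read any checkpoint; all handles were kept", lastFailure);
        }
        return recovered;
    }
}
```

The design choice in this sketch is that a read failure alone is never treated 
as proof that a checkpoint is gone. Whether and when a handle should really be 
discarded is left open here.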

A user reported this problem on the mailing list: 
https://lists.apache.org/thread.html/9dc9b719cf8449067ad01114fedb75d1beac7b4dff171acdcc24903d@%3Cuser.flink.apache.org%3E



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
