[jira] [Commented] (FLINK-4815) Automatic fallback to earlier checkpoints when checkpoint restore fails

Wei-Che Wei (JIRA) Wed, 01 Mar 2017 22:54:07 -0800

    [ 
https://issues.apache.org/jira/browse/FLINK-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891711#comment-15891711
 ]


Wei-Che Wei commented on FLINK-4815:
------------------------------------

Hi [~uce]

I saw that you implemented {{ZooKeeperCompletedCheckpointStore}} in FLINK-2354, 
and I found that its recover() method removes all checkpoints other than the 
latest one.
Does that mean recover() has to be bypassed or reworked in order to support 
this feature?
I am not sure whether changing that implementation would have side effects. 
As far as I know, it should be okay to retain all of these completed 
checkpoints, so I am confused by the comment you wrote for recover() in 
{{ZooKeeperCompletedCheckpointStore}}. Could you please explain it to me? 
Looking forward to your feedback. Thank you.
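
To make the question concrete, here is a minimal sketch of the fallback 
behaviour the issue describes, assuming all completed checkpoints are retained 
rather than only the latest one. It is plain Java and does not use any Flink 
classes: FallbackRestoreSketch, restoreWithFallback and the tryRestore callback 
are hypothetical names made up for illustration, not the actual 
CheckpointCoordinator or ZooKeeperCompletedCheckpointStore API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.OptionalLong;
import java.util.function.LongPredicate;

// Hypothetical sketch only; none of these names exist in Flink.
public class FallbackRestoreSketch {

    /**
     * Tries the newest retained checkpoint first. If restoring it fails
     * maxAttemptsPerCheckpoint times, falls back to the next older one.
     *
     * @param retainedCheckpointIds IDs of all retained completed checkpoints
     *                              (assumed: a larger ID means a newer checkpoint)
     * @param maxAttemptsPerCheckpoint restore attempts before falling back
     * @param tryRestore stand-in for the real restore logic; returns true on success
     * @return the ID of the checkpoint that was restored, or empty if none could be
     */
    public static OptionalLong restoreWithFallback(
            List<Long> retainedCheckpointIds,
            int maxAttemptsPerCheckpoint,
            LongPredicate tryRestore) {

        // Copy before sorting so the caller's list is left untouched.
        List<Long> newestFirst = new ArrayList<>(retainedCheckpointIds);
        newestFirst.sort(Comparator.reverseOrder());

        for (long checkpointId : newestFirst) {
            for (int attempt = 1; attempt <= maxAttemptsPerCheckpoint; attempt++) {
                if (tryRestore.test(checkpointId)) {
                    return OptionalLong.of(checkpointId);
                }
            }
            // All attempts for this checkpoint failed: fall back to an older one.
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Pretend checkpoint 3 cannot be restored while 1 and 2 can.
        OptionalLong restored =
                restoreWithFallback(Arrays.asList(1L, 2L, 3L), 2, id -> id != 3L);
        System.out.println(restored);
    }
}
```

A fallback like this only works if recover() leaves the older checkpoints in 
place, which is why I am asking about the current behaviour.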

> Automatic fallback to earlier checkpoints when checkpoint restore fails
> -----------------------------------------------------------------------
>
>                 Key: FLINK-4815
>                 URL: https://issues.apache.org/jira/browse/FLINK-4815
>             Project: Flink
>          Issue Type: New Feature
>          Components: State Backends, Checkpointing
>            Reporter: Stephan Ewen
>
> Flink should keep multiple completed checkpoints.
> When the restore of one completed checkpoint fails for a certain number of 
> times, the CheckpointCoordinator should fall back to an earlier checkpoint to 
> restore.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
