[jira] [Commented] (FLINK-6328) Savepoints must not be counted as retained checkpoints

Till Rohrmann (JIRA) Mon, 22 May 2017 07:55:30 -0700

    [ 
https://issues.apache.org/jira/browse/FLINK-6328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16019663#comment-16019663
 ]


Till Rohrmann commented on FLINK-6328:
--------------------------------------

Given that the lifecycle of a savepoint is outside the control of the 
{{CheckpointCoordinator}}, I think it is best not to add savepoints to the 
{{CompletedCheckpointStore}} and, thus, not to consider them for job recovery. 
The reason is that FLINK-4815 has not been added yet: until then, a single 
broken or deleted savepoint in the store will thwart Flink's whole recovery 
mechanism.

Once FLINK-4815 has been added, we might consider re-adding savepoints 
to the {{CompletedCheckpointStore}} and, thus, allow recovery from 
savepoints in case of failures. When doing so, we should, however, not count 
savepoints towards the number of retained checkpoints, because we cannot be 
sure that they still exist.
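
To make the counting rule concrete, here is a minimal sketch of a store that 
keeps savepoints but does not count them towards the retention limit. It is 
not Flink's actual {{CompletedCheckpointStore}}: the class 
RetainedCheckpointsSketch and all of its methods are hypothetical names 
chosen for illustration, and the eviction order (oldest entries first) is an 
assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch; not Flink's CompletedCheckpointStore. */
final class RetainedCheckpointsSketch {

    /** Hypothetical stand-in for a completed checkpoint or savepoint. */
    private static final class Entry {
        final long id;
        final boolean savepoint;

        Entry(long id, boolean savepoint) {
            this.id = id;
            this.savepoint = savepoint;
        }
    }

    private final int maxRetainedCheckpoints;
    private final Deque<Entry> entries = new ArrayDeque<>(); // oldest first
    private int regularCount = 0; // number of non-savepoint entries

    public RetainedCheckpointsSketch(int maxRetainedCheckpoints) {
        if (maxRetainedCheckpoints < 1) {
            throw new IllegalArgumentException("must retain at least one checkpoint");
        }
        this.maxRetainedCheckpoints = maxRetainedCheckpoints;
    }

    /** Adds an entry, then evicts old entries until the limit holds again. */
    public void add(long id, boolean isSavepoint) {
        entries.addLast(new Entry(id, isSavepoint));
        if (!isSavepoint) {
            regularCount++;
        }
        // Only regular checkpoints count against the limit. Savepoints that
        // are older than an evicted checkpoint are dropped along the way.
        Iterator<Entry> it = entries.iterator();
        while (regularCount > maxRetainedCheckpoints && it.hasNext()) {
            Entry oldest = it.next();
            it.remove();
            if (!oldest.savepoint) {
                regularCount--;
            }
        }
    }

    /** Ids of all entries currently held, oldest first. */
    public List<Long> retainedIds() {
        List<Long> ids = new ArrayList<>();
        for (Entry e : entries) {
            ids.add(e.id);
        }
        return ids;
    }

    /** Ids of the regular (non-savepoint) checkpoints currently held. */
    public List<Long> regularCheckpointIds() {
        List<Long> ids = new ArrayList<>();
        for (Entry e : entries) {
            if (!e.savepoint) {
                ids.add(e.id);
            }
        }
        return ids;
    }
}
```

With a limit of 1, adding checkpoint 1 and then savepoint 2 leaves both in 
the store, so checkpoint 1 is still available for recovery if the savepoint 
disappears. Adding checkpoint 3 afterwards evicts checkpoint 1 and leaves 
savepoint 2 and checkpoint 3.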

> Savepoints must not be counted as retained checkpoints
> ------------------------------------------------------
>
>                 Key: FLINK-6328
>                 URL: https://issues.apache.org/jira/browse/FLINK-6328
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.2.0, 1.3.0, 1.4.0
>            Reporter: Stephan Ewen
>            Assignee: Till Rohrmann
>            Priority: Blocker
>             Fix For: 1.3.0, 1.2.2
>
>
> The Checkpoint Store retains the *n* latest checkpoints.
> Savepoints are counted as well, meaning that for settings with 1 retained 
> checkpoint, there are sometimes no retained checkpoints at all, only a 
> savepoint.
> That is dangerous, because savepoints must be assumed to disappear at any 
> point in time - their lifecycle is out of control of the 
> CheckpointCoordinator.



