[ https://issues.apache.org/jira/browse/FLINK-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15696175#comment-15696175 ]
ASF GitHub Bot commented on FLINK-5158:
---------------------------------------

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/2872

    [FLINK-5158] [ckPtCoord] Handle exceptions from CompletedCheckpointStore in CheckpointCoordinator

    Handle exceptions from the CompletedCheckpointStore properly in the
    CheckpointCoordinator. This means that in case of an exception, the
    completed checkpoint will be properly cleaned up and also the triggering
    of subsequent checkpoints will be started.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink fixCheckpointCoordinatorExceptionHandling

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2872.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2872

----
commit 063a696b4eb5a259c714818c0b0ba5cc883a596d
Author: Till Rohrmann <trohrm...@apache.org>
Date:   2016-11-24T17:16:28Z

    [FLINK-5158] [ckPtCoord] Handle exceptions from CompletedCheckpointStore in CheckpointCoordinator

    Handle exceptions from the CompletedCheckpointStore properly in the
    CheckpointCoordinator. This means that in case of an exception, the
    completed checkpoint will be properly cleaned up and also the triggering
    of subsequent checkpoints will be started.

----

> Handle ZooKeeperCompletedCheckpointStore exceptions in CheckpointCoordinator
> ----------------------------------------------------------------------------
>
>                 Key: FLINK-5158
>                 URL: https://issues.apache.org/jira/browse/FLINK-5158
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.2.0, 1.1.3
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>             Fix For: 1.2.0, 1.1.4
>
>
> The checkpoint coordinator does not properly handle exceptions when trying to
> store completed checkpoints. As a result, completed checkpoints are not
> properly cleaned up and even worse, the {{CheckpointCoordinator}} might get
> stuck stopping triggering checkpoints.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
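
For readers without the patch at hand, the behaviour described in the pull request (discard the completed checkpoint if the store throws, and keep triggering subsequent checkpoints either way) can be pictured roughly as below. This is only a minimal sketch, not the code from the pull request: SketchCheckpoint, SketchCheckpointStore, SketchCoordinator and all of their methods are hypothetical stand-ins invented for illustration, and Flink's real CheckpointCoordinator and CompletedCheckpointStore have different signatures.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a completed checkpoint; not Flink's actual class. */
final class SketchCheckpoint {
    final long id;
    boolean discarded = false;

    SketchCheckpoint(long id) {
        this.id = id;
    }

    /** Marks the state held by this checkpoint as released. */
    void discard() {
        discarded = true;
    }
}

/** Hypothetical store; adding a checkpoint may fail, e.g. on a ZooKeeper error. */
interface SketchCheckpointStore {
    void addCheckpoint(SketchCheckpoint checkpoint) throws Exception;
}

/** Hypothetical coordinator showing only the exception-handling path. */
final class SketchCoordinator {
    private final SketchCheckpointStore store;
    /** Ids of the checkpoints triggered so far, recorded for inspection. */
    final List<Long> triggered = new ArrayList<>();
    private long nextId = 1;

    SketchCoordinator(SketchCheckpointStore store) {
        this.store = store;
    }

    /** Returns true if the checkpoint was stored, false if it was discarded. */
    boolean completeCheckpoint(SketchCheckpoint checkpoint) {
        boolean stored = false;
        try {
            store.addCheckpoint(checkpoint);
            stored = true;
        } catch (Exception e) {
            // The store rejected the checkpoint: clean it up so its state does not leak.
            checkpoint.discard();
        } finally {
            // Runs on success and on failure, so the coordinator never stops triggering.
            triggerNextCheckpoint();
        }
        return stored;
    }

    /** Records that a new checkpoint was triggered. */
    void triggerNextCheckpoint() {
        triggered.add(nextId++);
    }
}
```

With a store that always throws, completeCheckpoint returns false, the checkpoint ends up discarded, and one further checkpoint is still triggered. Placing the trigger in the finally block is one way to guarantee the second property; the actual patch may structure this differently.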