[ https://issues.apache.org/jira/browse/FLINK-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15696217#comment-15696217 ]
ASF GitHub Bot commented on FLINK-5158:
---------------------------------------

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/2873

    [backport] [FLINK-5158] [ckPtCoord] Handle exceptions from CompletedCheckpointStore in CheckpointCoordinator

    Backport of #2872 for the release 1.1 branch.

    Handle exceptions from the CompletedCheckpointStore properly in the CheckpointCoordinator. This means that in case of an exception, the completed checkpoint will be properly cleaned up and also the triggering of subsequent checkpoints will be started.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink backportFixCheckpointCoordinatorExceptionHandling

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2873.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2873

----
commit c68c08f7b478f354a5c432f8640a344dcf553190
Author: Till Rohrmann <trohrm...@apache.org>
Date:   2016-11-24T17:16:28Z

    [FLINK-5158] [ckPtCoord] Handle exceptions from CompletedCheckpointStore in CheckpointCoordinator

    Handle exceptions from the CompletedCheckpointStore properly in the CheckpointCoordinator. This means that in case of an exception, the completed checkpoint will be properly cleaned up and also the triggering of subsequent checkpoints will be started.
----

> Handle ZooKeeperCompletedCheckpointStore exceptions in CheckpointCoordinator
> ----------------------------------------------------------------------------
>
>                 Key: FLINK-5158
>                 URL: https://issues.apache.org/jira/browse/FLINK-5158
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.2.0, 1.1.3
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>             Fix For: 1.2.0, 1.1.4
>
>
> The checkpoint coordinator does not properly handle exceptions when trying to
> store completed checkpoints. As a result, completed checkpoints are not
> properly cleaned up and, even worse, the {{CheckpointCoordinator}} might get
> stuck and stop triggering checkpoints.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
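
For readers who want to picture the failure mode described in the issue, here is a minimal sketch of the general pattern: if adding a completed checkpoint to the store throws, discard the checkpoint and make sure checkpoint triggering is re-enabled. All names below (CheckpointStoreSketch, CheckpointSketch, CoordinatorSketch, canTrigger) are hypothetical stand-ins and are not Flink's actual classes or the code from the pull request; the idea that triggering is paused while a checkpoint is being finalized is also an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a completed checkpoint store; NOT Flink's interface.
interface CheckpointStoreSketch {
    void add(CheckpointSketch checkpoint) throws Exception;
}

// Hypothetical completed checkpoint that remembers whether it was cleaned up.
final class CheckpointSketch {
    final long id;
    boolean discarded;

    CheckpointSketch(long id) {
        this.id = id;
    }

    void discard() {
        discarded = true;
    }
}

// Hypothetical coordinator showing only the exception-handling pattern.
final class CoordinatorSketch {
    private final CheckpointStoreSketch store;
    private boolean triggeringEnabled = true;
    final List<Long> storedIds = new ArrayList<>();

    CoordinatorSketch(CheckpointStoreSketch store) {
        this.store = store;
    }

    /** Tries to store the checkpoint; returns true on success, false on store failure. */
    boolean complete(CheckpointSketch checkpoint) {
        // Assumption for this sketch: triggering is paused while finalizing.
        triggeringEnabled = false;
        try {
            store.add(checkpoint);
            storedIds.add(checkpoint.id);
            return true;
        } catch (Exception e) {
            // Clean up the checkpoint that could not be stored.
            checkpoint.discard();
            return false;
        } finally {
            // Runs on both paths, so the coordinator cannot get stuck.
            triggeringEnabled = true;
        }
    }

    boolean canTrigger() {
        return triggeringEnabled;
    }
}
```

With a store whose add method always throws, complete returns false, the checkpoint is marked as discarded, and canTrigger still reports true afterwards. Without the catch and finally blocks, the exception would propagate with the checkpoint left in place and triggering left disabled, which is the kind of stuck state the issue describes.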