[ https://issues.apache.org/jira/browse/FLINK-5214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712443#comment-15712443 ]
ASF GitHub Bot commented on FLINK-5214:
---------------------------------------

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/2918

    [FLINK-5214] Clean up checkpoint data in case of a failing checkpoint operation

    Adds exception handling to the stream operators for the snapshotState method.
    If an exception occurs while performing the snapshot operation, all data
    checkpointed up to that point is discarded/deleted. This makes sure that a
    failing checkpoint operation won't leave orphaned checkpoint data (e.g. files)
    behind.

    Add test case for FsCheckpointStateOutputStream
    Add RocksDB FullyAsyncSnapshot cleanup test
    Add proper state cleanup tests for window operator
    Add state cleanup test for failing snapshot call of AbstractUdfStreamOperator

    cc @StephanEwen

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink fixTaskCheckpointFailure

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2918.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2918

----
commit 35fc74dd501fc49aa0b55f415c85c2140206220a
Author: Till Rohrmann <trohrm...@apache.org>
Date:   2016-12-01T12:25:05Z

    [FLINK-5214] Clean up checkpoint data in case of a failing checkpoint operation

    Adds exception handling to the stream operators for the snapshotState method.
    If an exception occurs while performing the snapshot operation, all data
    checkpointed up to that point is discarded/deleted. This makes sure that a
    failing checkpoint operation won't leave orphaned checkpoint data (e.g. files)
    behind.

    Add test case for FsCheckpointStateOutputStream
    Add RocksDB FullyAsyncSnapshot cleanup test
    Add proper state cleanup tests for window operator
    Add state cleanup test for failing snapshot call of AbstractUdfStreamOperator

----

> Clean up checkpoint files when failing checkpoint operation on TM
> -----------------------------------------------------------------
>
>                 Key: FLINK-5214
>                 URL: https://issues.apache.org/jira/browse/FLINK-5214
>             Project: Flink
>          Issue Type: Bug
>          Components: TaskManager
>    Affects Versions: 1.2.0, 1.1.3
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>             Fix For: 1.2.0, 1.1.4
>
>
> When the {{StreamTask#performCheckpoint}} operation fails on a
> {{TaskManager}}, any checkpoint files that were already created are not
> cleaned up. This should be changed.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
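
For readers who want a concrete picture of the cleanup behaviour described in the pull request, here is a minimal sketch of the general pattern. It is not the code from the pull request and does not use Flink's API: the class CheckpointCleanupSketch, the interfaces StateHandle and SnapshotStep, and the method snapshotAll are all hypothetical names chosen for illustration. The sketch assumes each snapshot call returns a handle that can be discarded, and that a failure in any call should discard every handle produced before it.

```java
import java.util.ArrayList;
import java.util.List;

public class CheckpointCleanupSketch {

    /** Hypothetical stand-in for a handle to persisted checkpoint data (e.g. a file). */
    interface StateHandle {
        void discard() throws Exception;
    }

    /** Hypothetical stand-in for one operator's snapshot call. */
    interface SnapshotStep {
        StateHandle snapshot() throws Exception;
    }

    /**
     * Runs every snapshot step in order. If one step throws, all handles
     * created so far are discarded before the original exception is rethrown,
     * so no orphaned checkpoint data is left behind.
     */
    static List<StateHandle> snapshotAll(List<SnapshotStep> steps) throws Exception {
        List<StateHandle> completed = new ArrayList<>();
        try {
            for (SnapshotStep step : steps) {
                completed.add(step.snapshot());
            }
            return completed;
        } catch (Exception snapshotFailure) {
            for (StateHandle handle : completed) {
                try {
                    handle.discard();
                } catch (Exception discardFailure) {
                    // Keep the original failure as the primary exception.
                    snapshotFailure.addSuppressed(discardFailure);
                }
            }
            throw snapshotFailure;
        }
    }

    public static void main(String[] args) {
        List<String> discarded = new ArrayList<>();
        List<SnapshotStep> steps = new ArrayList<>();
        // Two steps succeed; their handles record when they are discarded.
        steps.add(() -> () -> discarded.add("op-1"));
        steps.add(() -> () -> discarded.add("op-2"));
        // The third step fails and should trigger the cleanup.
        steps.add(() -> { throw new IllegalStateException("snapshot failed"); });

        try {
            snapshotAll(steps);
        } catch (Exception e) {
            System.out.println("Checkpoint failed: " + e.getMessage());
            System.out.println("Discarded handles: " + discarded);
        }
    }
}
```

Attaching discard failures as suppressed exceptions is a design choice of this sketch: it keeps the snapshot failure as the reported cause while still trying to clean up every handle.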