[ https://issues.apache.org/jira/browse/FLINK-4814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15633263#comment-15633263 ]
ASF GitHub Bot commented on FLINK-4814:
---------------------------------------

GitHub user uce opened a pull request:

https://github.com/apache/flink/pull/2752

[FLINK-4814] [checkpointing] Use checkpoint directory for externalized checkpoints

This change drops the checkpoint directory configuration key and instead uses the checkpoint directory configured for the state backend in use (for `FsStateBackend` and `RocksDBStateBackend`). For backends without a checkpoint directory, like the `MemoryStateBackend`, you have to configure a checkpoint directory explicitly. Otherwise, job submission fails.

Externalized checkpoints now use the `FsCheckpointOutputStream`, too. This gives externalized checkpoints a clean layout, because the checkpoint meta data ends up next to the actual checkpoint data:

```
:checkpointDir/:jobId/chk-:checkpointId/
  +- :uuid // data
  .
  +- :uuid // data
  +- savepoint-:uuid // meta data
```

The checkpoint meta data and the actual data are self-contained in a single directory.

---

This currently also changes the target file for savepoints, though. Before this change you get

```
:savepointDir/savepoint-:rand
```

After this change you get

```
:savepointDir/:jobId/chk-:checkpointId/savepoint-:uuid
```

Is this OK to change?
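To make the directory rules above concrete, here is a minimal, self-contained Java sketch of how the target paths could be derived. It is not the code from this pull request: the class `ExternalizedCheckpointPaths` and its methods are hypothetical names, plain strings stand in for Flink's `Path`, and preferring the backend's directory over an explicitly configured one is an assumption (the description only says that an explicit directory is required when the backend has none).

```java
import java.util.Optional;
import java.util.UUID;

/** Illustration only: hypothetical helper, not part of Flink. */
final class ExternalizedCheckpointPaths {

    private ExternalizedCheckpointPaths() {}

    /**
     * Picks the base directory for externalized checkpoints.
     * Assumption: the backend's checkpoint directory wins if present;
     * otherwise an explicitly configured directory is required.
     */
    static String resolveBaseDir(Optional<String> backendDir, Optional<String> explicitDir) {
        if (backendDir.isPresent()) {
            return backendDir.get();
        }
        // Mirrors "the job submission will fail" for backends without a directory.
        return explicitDir.orElseThrow(() -> new IllegalStateException(
                "No checkpoint directory: the backend has none and none was configured explicitly"));
    }

    /** Builds :checkpointDir/:jobId/chk-:checkpointId */
    static String checkpointDir(String baseDir, String jobId, long checkpointId) {
        return stripTrailingSlashes(baseDir) + "/" + jobId + "/chk-" + checkpointId;
    }

    /** Builds the meta data file name inside a checkpoint directory: savepoint-:uuid */
    static String metadataFile(String checkpointDir, UUID uuid) {
        return checkpointDir + "/savepoint-" + uuid;
    }

    private static String stripTrailingSlashes(String dir) {
        String result = dir;
        while (result.length() > 1 && result.endsWith("/")) {
            result = result.substring(0, result.length() - 1);
        }
        return result;
    }

    public static void main(String[] args) {
        // File system backend: its own checkpoint directory is reused.
        String base = resolveBaseDir(Optional.of("hdfs:///flink/checkpoints/"), Optional.empty());
        String dir = checkpointDir(base, "job42", 7L);
        System.out.println(metadataFile(dir, UUID.randomUUID()));

        // Memory backend without an explicit directory: resolution fails.
        try {
            resolveBaseDir(Optional.empty(), Optional.empty());
        } catch (IllegalStateException e) {
            System.out.println("Submission would fail: " + e.getMessage());
        }
    }
}
```

With this layout, removing a single `chk-:checkpointId` directory removes both the data files and the meta data file of that checkpoint.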
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uce/flink 4814-external_checkpoint_config

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2752.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2752

----
commit b5d99b8b70ffc5a61a0d3bae20777ed2893313f3
Author: Ufuk Celebi <u...@apache.org>
Date:   2016-10-31T12:58:03Z

    [FLINK-4814] [refactoring] Add prefix option to FsCheckpointOutputStreams

    - Allows to configure a prefix for generated file names
    - Add a method to delete the created checkpoint directory

commit 7acbc970a72356eea49f603654f324dd8931eaf6
Author: Ufuk Celebi <u...@apache.org>
Date:   2016-11-01T15:54:17Z

    [FLINK-4814] [refactoring] Use FsStreamFactory and Path in SavepointStore

    - Use the FsStreamFactory instead of manually working with the FileSystem
    - Use Path instead of String for path arguments

commit f35eb0f906d88fdb724a4b17b1983d1af4c99f96
Author: Ufuk Celebi <u...@apache.org>
Date:   2016-11-01T16:51:02Z

    [FLINK-4814] [checkpointing] Use checkpoint directory for externalized checkpoints

    - Removes the config key for the checkpoint directory
    - Use the backend checkpoint directory for externalized checkpoints (fs, rocksDB)
      * With the mem backend, manual configuration is required

commit e55fb2ec5444003e114ba0ee90ca4b148c9f1d00
Author: Ufuk Celebi <u...@apache.org>
Date:   2016-11-02T15:21:06Z

    [FLINK-4814] [docs] Add docs about externalized checkpoints

----

> Remove extra storage location for externalized checkpoint metadata
> ------------------------------------------------------------------
>
>                 Key: FLINK-4814
>                 URL: https://issues.apache.org/jira/browse/FLINK-4814
>             Project: Flink
>          Issue Type: Sub-task
>            Reporter: Ufuk Celebi
>
> Follow up for FLINK-4512.
> Store checkpoint meta data in checkpoint directory. That makes it simpler
> for users to track and clean up checkpoints manually, if they want to retain
> externalized checkpoints across cancellations and terminal failures.
> Every state backend needs to be able to provide a storage location for the
> checkpoint metadata. The memory state backend would hence not work with
> externalized checkpoints, unless one sets explicitly a parameter
> `setExternalizedCheckpointsLocation(uri)`.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)