[ https://issues.apache.org/jira/browse/FLINK-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16050508#comment-16050508 ]
ASF GitHub Bot commented on FLINK-6773:
---------------------------------------

GitHub user StefanRRichter opened a pull request:

    https://github.com/apache/flink/pull/4130

    [FLINK-6773] [checkpoint] Introduce compression (snappy) for keyed st…

    This PR introduces optional snappy compression for the keyed state in full checkpoints and savepoints. This feature can be activated through a flag in {{ExecutionConfig}}.

    In the future, we can also support user-defined compression schemes, which will also require an upgrade and compatibility feature, as described in FLINK-6931.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StefanRRichter/flink compressedKeyGroups

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4130.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4130

----

----

> Use compression (e.g. snappy) for full check/savepoints
> -------------------------------------------------------
>
>                 Key: FLINK-6773
>                 URL: https://issues.apache.org/jira/browse/FLINK-6773
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>            Reporter: Stefan Richter
>            Assignee: Stefan Richter
>
> We could use compression (e.g. snappy stream compression) to decrease the
> size of our full checkpoints and savepoints. From some initial experiments, I
> think there is great potential to achieve compression rates around 30-50%.
> Given those numbers, I think this is very low hanging fruit to implement.
> One point to consider in the implementation is that compression blocks should
> respect key-groups, i.e. typically it should make sense to compress per
> key-group.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
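
To illustrate the point in the issue description that compression blocks should respect key-groups, here is a minimal sketch. It is not the code from the pull request: it uses Python's standard `zlib` as a stand-in for snappy, and the function names (`compress_key_groups`, `read_key_group`), the offset table layout, and the sample state are all hypothetical. The idea shown is only that each key-group is compressed as its own block, so a reader can decompress one key-group without touching the others.

```python
import zlib


def compress_key_groups(key_groups):
    """Compress each key-group's serialized bytes as an independent block.

    key_groups: dict mapping key-group id -> bytes (hypothetical layout).
    Returns (payload, offsets) where offsets[kg_id] = (start, length)
    locates that key-group's compressed block inside payload.
    """
    payload = bytearray()
    offsets = {}
    for kg_id in sorted(key_groups):
        # zlib is an assumption here; the PR itself uses snappy.
        block = zlib.compress(key_groups[kg_id])
        offsets[kg_id] = (len(payload), len(block))
        payload += block
    return bytes(payload), offsets


def read_key_group(payload, offsets, kg_id):
    """Decompress a single key-group without reading any other block."""
    start, length = offsets[kg_id]
    return zlib.decompress(payload[start:start + length])


# Made-up sample state: two key-groups with repetitive content.
state = {
    0: b"user-1=42;user-2=42;" * 50,
    1: b"user-3=7;user-4=7;" * 50,
}
payload, offsets = compress_key_groups(state)
restored = read_key_group(payload, offsets, 1)
```

Because no compressed block spans a key-group boundary, a restore that is assigned only a subset of key-groups can seek to the recorded offsets and decompress just those blocks. Compressing the whole stream as one unit would likely give a slightly better ratio, but would force every reader to decompress from the start.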