[ 
https://issues.apache.org/jira/browse/FLINK-6773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16050508#comment-16050508
 ] 

ASF GitHub Bot commented on FLINK-6773:
---------------------------------------

GitHub user StefanRRichter opened a pull request:

    https://github.com/apache/flink/pull/4130

    [FLINK-6773] [checkpoint] Introduce compression (snappy) for keyed st…

    This PR introduces optional snappy compression for the keyed state in full 
checkpoints and savepoints. This feature can be activated through a flag in 
{{ExecutionConfig}}.
    
    In the future, we could also support user-defined compression schemes, which 
would also require an upgrade and compatibility feature, as described in 
FLINK-6931.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StefanRRichter/flink compressedKeyGroups

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4130.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4130
    
----


> Use compression (e.g. snappy) for full check/savepoints
> -------------------------------------------------------
>
>                 Key: FLINK-6773
>                 URL: https://issues.apache.org/jira/browse/FLINK-6773
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>            Reporter: Stefan Richter
>            Assignee: Stefan Richter
>
> We could use compression (e.g. snappy stream compression) to decrease the 
> size of our full checkpoints and savepoints. From some initial experiments, I 
> think there is great potential to achieve compression rates around 30-50%. 
> Given those numbers, I think this is very low-hanging fruit to implement.
> One point to consider in the implementation is that compression blocks should 
> respect key-groups, i.e. typically it should make sense to compress per 
> key-group.
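
The point about compression blocks respecting key-groups can be illustrated with a
minimal sketch. It is not the code from the pull request: it uses the JDK's
java.util.zip Deflater streams as a stand-in for snappy so that it runs without
extra dependencies, and the class name KeyGroupCompressionSketch, the Snapshot
layout, and the offsets array are hypothetical names chosen for illustration. The
idea shown is only that each key-group gets its own compression stream, so a
single key-group can later be read without decompressing its neighbours.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

public class KeyGroupCompressionSketch {

    /** Hypothetical snapshot layout: compressed bytes plus the start offset of each key-group. */
    static final class Snapshot {
        final byte[] data;
        final int[] offsets; // offsets[i] = start of key-group i, offsets[n] = end of data

        Snapshot(byte[] data, int[] offsets) {
            this.data = data;
            this.offsets = offsets;
        }
    }

    /** Compresses every key-group with its own stream so no block spans two groups. */
    static Snapshot write(byte[][] keyGroups) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int[] offsets = new int[keyGroups.length + 1];
        for (int i = 0; i < keyGroups.length; i++) {
            offsets[i] = out.size();
            // Closing the compressing stream finishes this group's block;
            // closing the underlying ByteArrayOutputStream is a no-op.
            try (OutputStream compressed = new DeflaterOutputStream(out)) {
                compressed.write(keyGroups[i]);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        offsets[keyGroups.length] = out.size();
        return new Snapshot(out.toByteArray(), offsets);
    }

    /** Decompresses one key-group using only its own byte range. */
    static byte[] readGroup(Snapshot snapshot, int keyGroup) {
        int start = snapshot.offsets[keyGroup];
        int length = snapshot.offsets[keyGroup + 1] - start;
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (InputStream in = new InflaterInputStream(
                new ByteArrayInputStream(snapshot.data, start, length))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                result.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result.toByteArray();
    }

    public static void main(String[] args) {
        byte[][] keyGroups = {
            "state of key-group 0".getBytes(StandardCharsets.UTF_8),
            "state of key-group 1".getBytes(StandardCharsets.UTF_8)
        };
        Snapshot snapshot = write(keyGroups);
        // Read back only key-group 1 and check that it round-trips.
        System.out.println(Arrays.equals(readGroup(snapshot, 1), keyGroups[1]));
    }
}
```

Starting a fresh stream per key-group costs some compression ratio on small
groups, but it keeps each key-group independently addressable, which is the
property the issue description asks for.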



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)