[ 
https://issues.apache.org/jira/browse/FLINK-6284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008047#comment-16008047
 ] 

ASF GitHub Bot commented on FLINK-6284:
---------------------------------------

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/3884

    [FLINK-6284] Correct sorting of completed checkpoints in ZooKeeperStateHandleStore

    To store completed checkpoints in increasing order in ZooKeeper, the paths
    for completed checkpoints are now generated by
    `String.format("/%019d", checkpointId)` instead of
    `String.format("/%s", checkpointId)`.
    This ensures that the converted long always has the same length, padded
    with leading 0s, so that the lexical order of the paths matches the
    numeric order of the checkpoint IDs.
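
The effect of the padding can be illustrated with a minimal, standalone sketch. It is not the code from this pull request: the class and method names (`CheckpointPathSorting`, `unpaddedPath`, `paddedPath`, `sortedPaths`) are hypothetical, and a plain `Collections.sort` on strings stands in for the name-based ordering of ZooKeeper paths. Only the two format strings are taken from the description above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class CheckpointPathSorting {

    // Old scheme: plain decimal string, so the path length varies with the ID.
    static String unpaddedPath(long checkpointId) {
        return String.format("/%s", checkpointId);
    }

    // New scheme: zero-padded to 19 digits, the number of decimal digits
    // of Long.MAX_VALUE, so all non-negative IDs yield equal-length paths.
    static String paddedPath(long checkpointId) {
        return String.format("/%019d", checkpointId);
    }

    // Hypothetical helper: builds the paths and sorts them lexically,
    // mimicking a sort by node name.
    static List<String> sortedPaths(long[] ids, boolean padded) {
        List<String> paths = new ArrayList<>();
        for (long id : ids) {
            paths.add(padded ? paddedPath(id) : unpaddedPath(id));
        }
        Collections.sort(paths);
        return paths;
    }

    public static void main(String[] args) {
        long[] ids = {99L, 100L, 101L};
        // Unpadded: "/99" sorts last although 101 is the latest checkpoint.
        System.out.println(sortedPaths(ids, false));
        // Padded: lexical order matches numeric order.
        System.out.println(sortedPaths(ids, true));
    }
}
```

With the unpadded paths, the last element after sorting is `/99`, so a recovery that picks the last path would miss checkpoints 100 and 101. With the padded paths the last element corresponds to checkpoint 101.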

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink fixZooKeeperSorting

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3884.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3884
    
----
commit 5bd499329d68c6f3236b4e89ba25fdb9acb7e422
Author: Till Rohrmann <trohrm...@apache.org>
Date:   2017-05-12T12:23:37Z

    [FLINK-6284] Correct sorting of completed checkpoints in ZooKeeperStateHandleStore
    
    To store completed checkpoints in increasing order in ZooKeeper, the paths
    for completed checkpoints are now generated by
    String.format("/%019d", checkpointId) instead of
    String.format("/%s", checkpointId).
    This ensures that the converted long always has the same length, padded
    with leading 0s.

----


> Incorrect sorting of completed checkpoints in ZooKeeperCompletedCheckpointStore
> -------------------------------------------------------------------------------
>
>                 Key: FLINK-6284
>                 URL: https://issues.apache.org/jira/browse/FLINK-6284
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>            Reporter: Xiaogang Shi
>            Priority: Blocker
>             Fix For: 1.3.0
>
>
> Currently, all completed checkpoints are sorted by their paths when they are 
> recovered in {{ZooKeeperCompletedCheckpointStore}}. When the latest 
> checkpoint's ID is not the largest in lexical order (e.g., "100" is 
> smaller than "99" in lexical order), Flink will not recover from the latest 
> completed checkpoint.
> The problem can be easily observed by setting the checkpoint IDs in 
> {{ZooKeeperCompletedCheckpointStoreITCase#testRecover()}} to 99, 100 and 
> 101. 
> To fix the problem, we should explicitly sort the found checkpoints by their 
> checkpoint IDs, without using 
> {{ZooKeeperStateHandleStore#getAllSortedByName()}}.
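
The fix suggested in the issue description, sorting explicitly by checkpoint ID rather than by name, could look roughly like the following sketch. It is an illustration only and not Flink code: `SortByCheckpointId`, `idFromPath` and `sortByCheckpointId` are hypothetical names, and the sketch assumes each path consists of a leading `/` followed by the decimal checkpoint ID.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SortByCheckpointId {

    // Hypothetical helper: extracts the numeric checkpoint ID from a path
    // such as "/100". Assumes the path is "/" followed by decimal digits.
    static long idFromPath(String path) {
        return Long.parseLong(path.substring(1));
    }

    // Returns a copy of the paths ordered by numeric checkpoint ID
    // instead of by lexical name.
    static List<String> sortByCheckpointId(List<String> paths) {
        List<String> sorted = new ArrayList<>(paths);
        sorted.sort(Comparator.comparingLong(SortByCheckpointId::idFromPath));
        return sorted;
    }

    public static void main(String[] args) {
        // Input is in lexical order; output is in numeric order.
        List<String> lexical = Arrays.asList("/100", "/101", "/99");
        System.out.println(sortByCheckpointId(lexical));
    }
}
```

Because the comparison is numeric, this ordering does not depend on how the IDs are formatted in the path, as long as they can be parsed back into a long.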



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
