Nico Kruber created FLINK-11159:
-----------------------------------

             Summary: Allow configuration whether to fall back to savepoints 
for restore
                 Key: FLINK-11159
                 URL: https://issues.apache.org/jira/browse/FLINK-11159
             Project: Flink
          Issue Type: Improvement
          Components: State Backends, Checkpointing
    Affects Versions: 1.7.0, 1.6.2, 1.5.5
            Reporter: Nico Kruber


Ever since FLINK-3397, Flink restarts after a failure from the latest 
checkpoint or savepoint, whichever is more recent. With the introduction of 
local recovery, and given that a RocksDB checkpoint restore just copies the 
files, it may be time to reconsider this behaviour and make it configurable:
In certain situations, it may be faster to restore from the latest checkpoint 
only (even if there is a more recent savepoint) and reprocess the data in 
between. 
On the downside, this may not be correct: if the savepoint was the most recent 
snapshot, restoring from an older checkpoint can repeat side effects. Consider 
this chain: {{chk1 -> chk2 -> sp … restore chk2 -> …}}. All side effects 
produced between {{chk2 -> sp}} would then be reproduced.
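
To make the trade-off concrete, here is a minimal sketch of the selection 
logic such an option could control. It is not Flink code: {{Snapshot}}, 
{{pick_restore_point}} and the {{prefer_checkpoint}} flag are hypothetical 
names used only for illustration, and the history list is assumed to be 
ordered from oldest to newest.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical model of a completed checkpoint or savepoint."""
    name: str
    is_savepoint: bool


def pick_restore_point(history: List[Snapshot],
                       prefer_checkpoint: bool) -> Optional[Snapshot]:
    """Pick the snapshot to restore from; history is ordered oldest to newest."""
    if not history:
        return None
    if prefer_checkpoint:
        # Walk backwards and take the most recent snapshot that is a checkpoint.
        for snapshot in reversed(history):
            if not snapshot.is_savepoint:
                return snapshot
    # Behaviour described above: latest snapshot of any kind. Also used as a
    # fallback when only savepoints exist.
    return history[-1]


# The chain from the description: chk1 -> chk2 -> sp
history = [
    Snapshot("chk1", is_savepoint=False),
    Snapshot("chk2", is_savepoint=False),
    Snapshot("sp", is_savepoint=True),
]

print(pick_restore_point(history, prefer_checkpoint=False).name)
print(pick_restore_point(history, prefer_checkpoint=True).name)
```

With {{prefer_checkpoint=True}} the sketch selects {{chk2}}, so everything 
processed between {{chk2}} and {{sp}} is replayed, which is exactly where the 
repeated side effects would come from.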

Making this configurable would let users choose the behaviour they need, or 
can afford, in order to get the lowest recovery time in Flink.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
