[ https://issues.apache.org/jira/browse/FLINK-22684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346027#comment-17346027 ]
Anton Kalashnikov commented on FLINK-22684:
-------------------------------------------

[~pnowojski], [~roman_khachatryan] A couple of questions:
 * Should the new parameter be part of CheckpointConfig or SavepointConfig? Given the original problem, it should be SavepointConfig, but I see a naming problem here: a savepoint does not actually contain any in-flight data, so it would be strange for SavepointConfig to have an ignoreInFlightData property.
 * Should this new property be something more complicated than just a boolean? For example, it could be a more complex property that allows ignoring in-flight data only for a specific operator/subtask. Initially, though, we could implement only two options: NONE or ALL.
 * How expensive is it in general to load the metadata of the in-flight data? Initially, I thought it would make sense to load all the metadata as usual and then, inside the CheckpointCoordinator, do some transformations as needed. But now I think that might be expensive, and it might be better to move this logic deeper and not even load it from storage.

> Add the ability to ignore in-flight data on recovery
> ----------------------------------------------------
>
>                 Key: FLINK-22684
>                 URL: https://issues.apache.org/jira/browse/FLINK-22684
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: Anton Kalashnikov
>            Priority: Major
>
> The main case:
>  * We want to restore the last unaligned checkpoint.
>  * In-flight data of this checkpoint is corrupted.
>  * We want to ignore this corrupted data and restore only states.
> The idea is to have a new configuration parameter ('ignoreInFlightDataOnRecovery'
> or similar). If it is set to true, ignore the metadata of in-flight data on
> the Checkpoint Coordinator side.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
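
For illustration only, here is a minimal sketch of the "NONE or ALL" idea from the second question in the comment. Everything in it is an assumption and not Flink code: the enum IgnoreInFlightData, the SubtaskMetadata stand-in, and the applyMode helper are hypothetical names. It only shows why an enum leaves room for finer-grained modes (for example per operator/subtask) that a boolean would not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InFlightDataSketch {

    /** Hypothetical option; only the values NONE and ALL come from the discussion. */
    public enum IgnoreInFlightData { NONE, ALL }

    /** Hypothetical stand-in for the metadata restored for one subtask. */
    public static final class SubtaskMetadata {
        public final String stateHandle;               // regular operator/keyed state
        public final List<String> inFlightDataHandles; // persisted channel data

        public SubtaskMetadata(String stateHandle, List<String> inFlightDataHandles) {
            this.stateHandle = stateHandle;
            this.inFlightDataHandles = new ArrayList<>(inFlightDataHandles);
        }
    }

    /**
     * Returns the metadata to restore from. With ALL, the in-flight data
     * handles are dropped and only the regular state is kept.
     */
    public static SubtaskMetadata applyMode(SubtaskMetadata restored, IgnoreInFlightData mode) {
        switch (mode) {
            case ALL:
                return new SubtaskMetadata(restored.stateHandle, new ArrayList<>());
            case NONE:
            default:
                return restored;
        }
    }

    public static void main(String[] args) {
        SubtaskMetadata restored =
                new SubtaskMetadata("state-1", Arrays.asList("channel-0", "channel-1"));
        SubtaskMetadata stripped = applyMode(restored, IgnoreInFlightData.ALL);
        System.out.println(stripped.stateHandle + " " + stripped.inFlightDataHandles.size());
    }
}
```

Note that this sketch filters after the metadata has been loaded, which is the first approach mentioned in the third question; it says nothing about the cost of that load or about skipping it at the storage level.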