[ https://issues.apache.org/jira/browse/FLINK-7883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16697931#comment-16697931 ]
Steven Zhen Wu commented on FLINK-7883:
---------------------------------------

We would love to see this happen. It is the "graceful" shutdown we need in order to minimize duplicates if we are going to enable aggressive, frequent rescale events. Otherwise, we are going to see frequent and significant duplicates.

> Make savepoints atomic with respect to state and side effects
> -------------------------------------------------------------
>
>                 Key: FLINK-7883
>                 URL: https://issues.apache.org/jira/browse/FLINK-7883
>             Project: Flink
>          Issue Type: Improvement
>          Components: DataStream API, Kafka Connector, State Backends, Checkpointing
>    Affects Versions: 1.3.2, 1.4.0
>            Reporter: Antoine Philippot
>            Priority: Major
>
> For a cancel-with-savepoint command, the JobManager triggers the cancel call
> once the savepoint is finished, but during the savepoint execution the Kafka
> source continues to poll new messages, which will not be part of the savepoint
> and will be replayed on the next application start.
> A solution could be to stop fetching in the source stream task before
> triggering the savepoint.
> I suggest adding an interface {{StoppableFetchingSourceFunction}} with a
> method {{stopFetching}} that existing SourceFunction implementations could
> implement.
> We can add a {{stopFetchingSource}} property to the
> {{CheckpointOptions}} class to pass the desired behaviour from
> {{JobManager.handleMessage(CancelJobWithSavepoint)}} to
> {{SourceStreamTask.triggerCheckpoint}}
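
For illustration only, the proposal quoted above could be sketched roughly as follows. This is a self-contained toy, not Flink code: the interface name and stopFetching method come from the issue description, while ToySource, pollOnce, snapshotOffset and run are hypothetical names invented here, and an in-memory list stands in for the Kafka topic. It only shows the intended ordering (stop fetching first, then take the savepoint), so that nothing is emitted past the offset the savepoint records.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

public class StopFetchingSketch {

    /** Interface name taken from the issue text; it is a proposal, not an existing Flink API. */
    interface StoppableFetchingSourceFunction {
        void stopFetching();
    }

    /** Hypothetical stand-in for a Kafka-like source, reading from an in-memory list. */
    static final class ToySource implements StoppableFetchingSourceFunction {
        private final Iterator<String> input;
        private final AtomicBoolean fetching = new AtomicBoolean(true);
        private int offset = 0; // the state a savepoint would capture

        ToySource(List<String> records) {
            this.input = records.iterator();
        }

        @Override
        public void stopFetching() {
            fetching.set(false);
        }

        /** Emits at most one record; returns false if nothing was emitted. */
        boolean pollOnce(List<String> out) {
            if (!fetching.get() || !input.hasNext()) {
                return false;
            }
            out.add(input.next());
            offset++;
            return true;
        }

        int snapshotOffset() {
            return offset;
        }
    }

    /** Simulates cancel-with-savepoint: returns {savepoint offset, records emitted}. */
    static int[] run(int recordsBeforeSavepoint) {
        ToySource source = new ToySource(List.of("a", "b", "c", "d", "e"));
        List<String> emitted = new ArrayList<>();
        for (int i = 0; i < recordsBeforeSavepoint; i++) {
            source.pollOnce(emitted);
        }
        source.stopFetching();                  // step 1: stop fetching
        source.pollOnce(emitted);               // a poll racing with the savepoint is ignored
        int savepointOffset = source.snapshotOffset(); // step 2: take the savepoint
        source.pollOnce(emitted);               // still ignored until the job is cancelled
        return new int[] {savepointOffset, emitted.size()};
    }

    public static void main(String[] args) {
        int[] result = run(3);
        System.out.println("savepoint offset=" + result[0] + ", emitted=" + result[1]);
    }
}
```

In this toy, the offset recorded by the savepoint and the number of records emitted downstream stay equal, which is the property the issue asks for. Without the stopFetching call, the polls after the snapshot would emit records that get replayed on restart.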