lvyanquan commented on code in PR #3619: URL: https://github.com/apache/flink-cdc/pull/3619#discussion_r1857603559
##########
flink-cdc-connect/flink-cdc-source-connectors/flink-cdc-base/src/main/java/org/apache/flink/cdc/connectors/base/source/assigner/SnapshotSplitAssigner.java:
##########
@@ -397,6 +491,27 @@ && allSnapshotSplitsFinished()) {
             }
             LOG.info("Snapshot split assigner is turn into finished status.");
         }
+
+        if (splitFinishedCheckpointIds != null && !splitFinishedCheckpointIds.isEmpty()) {
+            Iterator<Map.Entry<String, Long>> iterator =
+                    splitFinishedCheckpointIds.entrySet().iterator();
+            while (iterator.hasNext()) {
+                Map.Entry<String, Long> splitFinishedCheckpointId = iterator.next();
+                String splitId = splitFinishedCheckpointId.getKey();
+                Long splitCheckpointId = splitFinishedCheckpointId.getValue();
+                if (splitCheckpointId != UNDEFINED_CHECKPOINT_ID
+                        && checkpointId >= splitCheckpointId) {

Review Comment:
> I'm wondering why do we need to update the metric with the help of checkpointId information?

Got it. The checkpoint ID is used to record which splits have actually been processed downstream. `onFinishedSplits` only means that the reader has finished a split, not that the split's data has been written downstream, so we need to wait for a checkpoint to complete before we can be sure of that.
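To make the reasoning above concrete, here is a minimal, self-contained sketch of the idea as I understand it. It is not the PR's code: `FinishedSplitTracker`, `onFinishedSplit`, `snapshotState`, and `getProcessedSplits` are hypothetical names, the plain counter stands in for the real metric, and I am assuming that a finished split is tagged with the ID of the next checkpoint taken after it finished. Only `splitFinishedCheckpointIds`, `UNDEFINED_CHECKPOINT_ID`, and the `checkpointId >= splitCheckpointId` check come from the diff.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical sketch; not the actual SnapshotSplitAssigner. */
class FinishedSplitTracker {
    static final long UNDEFINED_CHECKPOINT_ID = -1L;

    // split id -> id of the first checkpoint taken after the split finished
    private final Map<String, Long> splitFinishedCheckpointIds = new HashMap<>();

    // Stand-in for the metric that counts splits processed downstream.
    private int processedSplits = 0;

    /** The reader reported the split as finished; no checkpoint covers it yet. */
    void onFinishedSplit(String splitId) {
        splitFinishedCheckpointIds.put(splitId, UNDEFINED_CHECKPOINT_ID);
    }

    /** Assumption: a checkpoint being taken tags every still-untagged split. */
    void snapshotState(long checkpointId) {
        for (Map.Entry<String, Long> entry : splitFinishedCheckpointIds.entrySet()) {
            if (entry.getValue() == UNDEFINED_CHECKPOINT_ID) {
                entry.setValue(checkpointId);
            }
        }
    }

    /** Count a split only once a checkpoint covering it has completed. */
    void notifyCheckpointComplete(long checkpointId) {
        Iterator<Map.Entry<String, Long>> iterator =
                splitFinishedCheckpointIds.entrySet().iterator();
        while (iterator.hasNext()) {
            long splitCheckpointId = iterator.next().getValue();
            if (splitCheckpointId != UNDEFINED_CHECKPOINT_ID
                    && checkpointId >= splitCheckpointId) {
                processedSplits++;
                iterator.remove(); // each split is counted exactly once
            }
        }
    }

    int getProcessedSplits() {
        return processedSplits;
    }
}
```

With this shape, a split that finishes between checkpoints 1 and 2 is not counted when checkpoint 1 completes. It is counted only when checkpoint 2 (or a later one) completes, which matches the point that a finished split is not necessarily written downstream yet.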