Re: [PR] [FLINK-36315][cdc-base]The flink-cdc-base module supports source metric statistics [flink-cdc]

via GitHub Mon, 25 Nov 2024 21:51:01 -0800


lvyanquan commented on code in PR #3619:
URL: https://github.com/apache/flink-cdc/pull/3619#discussion_r1857603559



##########
flink-cdc-connect/flink-cdc-source-connectors/flink-cdc-base/src/main/java/org/apache/flink/cdc/connectors/base/source/assigner/SnapshotSplitAssigner.java:
##########
@@ -397,6 +491,27 @@ && allSnapshotSplitsFinished()) {
             }
             LOG.info("Snapshot split assigner is turn into finished status.");
         }
+
+        if (splitFinishedCheckpointIds != null && 
!splitFinishedCheckpointIds.isEmpty()) {
+            Iterator<Map.Entry<String, Long>> iterator =
+                    splitFinishedCheckpointIds.entrySet().iterator();
+            while (iterator.hasNext()) {
+                Map.Entry<String, Long> splitFinishedCheckpointId = 
iterator.next();
+                String splitId = splitFinishedCheckpointId.getKey();
+                Long splitCheckpointId = splitFinishedCheckpointId.getValue();
+                if (splitCheckpointId != UNDEFINED_CHECKPOINT_ID
+                        && checkpointId >= splitCheckpointId) {

Review Comment:
   > I'm wondering why do we need to update the metric with the help of 
checkpointId information?
   
   Got it. It is used to record the splits processed by downstream, because 
`onFinishedSplits` doesn't mean that splits are written to downstream, we need 
to wait for a checkpoint to ensure that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Re: [PR] [FLINK-36315][cdc-base]The flink-cdc-base module supports source metric statistics [flink-cdc]

Reply via email to