XiaoYou201 opened a new pull request, #10576:
URL: https://github.com/apache/inlong/pull/10576

   ### [INLONG-10575][Sort] Make mysql source support report audit information 
exactly once
   
   Fixes #10575
   
   ### Modifications
   1. The audit reporting flow now follows Flink's checkpoint mechanism; the 
modified flow is shown in Figure 1 and Figure 2. In Figure 1, audit information 
is uploaded from the notifyCompleteCheckpoint callback instead of on a 
schedule. Each Source/Sink saves the checkpointId of the checkpoint currently 
in progress, shown as nowCheckpointId in the figure. When audit information is 
written to the Buffer, nowCheckpointId is attached to it, indicating that the 
audit information was written during that checkpoint. Audit information and 
checkpointId are in a many-to-one relationship.
   
   
![image](https://github.com/apache/inlong/assets/58425449/2f09c6c6-e9c2-4383-bca7-965333ccab65)
   When a snapshot request is received, the current operator's nowCheckpointId 
is updated to the snapshot's checkpointId + 1. When all operators in a task 
have completed the snapshot, the notifyCompleteCheckpoint method is called back.
   
   At this point, AuditBuffer uploads all audit information whose attached 
checkpointId is less than or equal to the checkpointId passed as the parameter 
of notifyCompleteCheckpoint.
   
   
![image](https://github.com/apache/inlong/assets/58425449/33d2de48-df21-49ce-96c7-d044d25f37b0)
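
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class `CheckpointAuditBuffer`, its method names, and the in-memory `uploaded` list (standing in for the real audit upload) are all hypothetical, and audit information is reduced to a simple count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a checkpoint-aligned audit buffer (not the InLong classes). */
final class CheckpointAuditBuffer {

    // Audit counts keyed by the checkpoint during which they were written.
    // Many audit records map to one checkpointId, so counts are summed per key.
    private final TreeMap<Long, Long> countsByCheckpoint = new TreeMap<>();

    // ID of the checkpoint currently in progress (nowCheckpointId in the figure).
    private long nowCheckpointId = 0L;

    // Stand-in for the real upload target; an assumption made for illustration only.
    private final List<String> uploaded = new ArrayList<>();

    /** Buffers audit information and tags it with the ongoing checkpoint. */
    synchronized void add(long count) {
        countsByCheckpoint.merge(nowCheckpointId, count, Long::sum);
    }

    /** On a snapshot request, later records belong to the next checkpoint. */
    synchronized void snapshotState(long checkpointId) {
        nowCheckpointId = checkpointId + 1;
    }

    /** On checkpoint completion, uploads everything tagged <= checkpointId. */
    synchronized void notifyCompleteCheckpoint(long checkpointId) {
        // headMap(..., true) is a live view of all entries with key <= checkpointId.
        Map<Long, Long> ready = countsByCheckpoint.headMap(checkpointId, true);
        for (Map.Entry<Long, Long> entry : ready.entrySet()) {
            uploaded.add(entry.getKey() + ":" + entry.getValue());
        }
        // Clearing the view removes the uploaded entries from the buffer itself.
        ready.clear();
    }

    synchronized List<String> uploaded() {
        return new ArrayList<>(uploaded);
    }

    synchronized int pendingCheckpoints() {
        return countsByCheckpoint.size();
    }
}
```

In this sketch, records written after `snapshotState(n)` are tagged `n + 1`, so they stay in the buffer when checkpoint `n` completes and are only uploaded once a later checkpoint is confirmed.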
   
   2. The getCurConsumedPartitions method returns the partitions assigned to 
the client by the tube server, including both partitions from which the client 
has consumed data and partitions from which it has not. Previously, the offsets 
of partitions that had not yet been consumed were not recorded. This change 
records the offsets of those partitions as well.
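
As a rough illustration of the second change, the sketch below builds an offset snapshot that covers every assigned partition, not only the consumed ones. All names here (`PartitionOffsetSnapshot`, `snapshotOffsets`, the string partition keys) are hypothetical, and the offset recorded for an unconsumed partition is passed in as a parameter because this description does not state which value the connector uses.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: record an offset for every assigned partition. */
final class PartitionOffsetSnapshot {

    private PartitionOffsetSnapshot() {
    }

    /**
     * @param assignedPartitions all partitions assigned to the client
     * @param consumedOffsets    offsets of partitions that have been consumed
     * @param unconsumedOffset   assumed placeholder for partitions with no consumed data
     */
    static Map<String, Long> snapshotOffsets(
            Collection<String> assignedPartitions,
            Map<String, Long> consumedOffsets,
            long unconsumedOffset) {
        Map<String, Long> snapshot = new HashMap<>();
        for (String partition : assignedPartitions) {
            // Previously only consumed partitions were recorded; here the
            // unconsumed ones are included with the placeholder offset.
            snapshot.put(partition, consumedOffsets.getOrDefault(partition, unconsumedOffset));
        }
        return snapshot;
    }
}
```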
   


