leonardBang commented on code in PR #3349:
URL: https://github.com/apache/flink-cdc/pull/3349#discussion_r1623938661


##########
docs/content/docs/connectors/flink-sources/postgres-cdc.md:
##########
@@ -236,6 +236,17 @@ Connector Options
           so it does not need to be explicitly configured 'execution.checkpointing.checkpoints-after-tasks-finish.enabled' = 'true'
       </td>
     </tr>
+    <tr>
+      <td>scan.lsn-commit.checkpoints-num-delay</td>
+      <td>optional</td>
+      <td style="word-wrap: break-word;">3</td>
+      <td>Integer</td>
+      <td>The number of checkpoint delays before starting to commit the LSN offsets. <br>
+          The checkpoint LSN offsets will be committed in rolling fashion, the earliest checkpoint identifier will be committed first from the delayed checkpoints.
+          This will enable continuous recycling of log files, preventing disk space issues. <br>
+          This feature is not available in `PostgreSQLSource` since it is deprecated.

Review Comment:
   When consuming PostgreSQL logs, the LSN offset must be committed to trigger
cleanup of the log data for the corresponding replication slot. However, once
an LSN offset is committed, earlier offsets become invalid. To keep earlier LSN
offsets available for job recovery, we delay the LSN commit by 3 checkpoints by
default. This feature is available when the config option
`scan.incremental.snapshot.enabled` is set to true.
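
To illustrate the rolling commit described above, here is a minimal sketch in Java. It is not the connector's actual code: the class `DelayedLsnCommitter` and its methods are hypothetical names invented for this example, and recording checkpoint IDs in a list stands in for the real LSN commit to PostgreSQL. It assumes that a checkpoint's offset is committed only once more than `checkpointsNumDelay` completed checkpoints are pending, earliest first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical illustration of a delayed, rolling LSN commit. */
class DelayedLsnCommitter {
    // Corresponds to scan.lsn-commit.checkpoints-num-delay (default 3).
    private final int checkpointsNumDelay;
    // Completed checkpoints whose LSN offsets have not been committed yet,
    // ordered so that the earliest checkpoint ID is polled first.
    private final PriorityQueue<Long> pending = new PriorityQueue<>();
    // Stand-in for the real commit: records which checkpoints were committed.
    private final List<Long> committed = new ArrayList<>();

    DelayedLsnCommitter(int checkpointsNumDelay) {
        this.checkpointsNumDelay = checkpointsNumDelay;
    }

    /** Called once per completed checkpoint. */
    void notifyCheckpointComplete(long checkpointId) {
        pending.add(checkpointId);
        // Commit the earliest pending checkpoint only while more than
        // checkpointsNumDelay checkpoints are being held back.
        while (pending.size() > checkpointsNumDelay) {
            committed.add(pending.poll());
        }
    }

    List<Long> committed() {
        return committed;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

With a delay of 3, completing checkpoints 1 through 5 commits only checkpoints 1 and 2 in this sketch; the offsets of checkpoints 3, 4, and 5 stay uncommitted and therefore remain usable for recovery.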


