zoltar9264 commented on code in PR #21822:
URL: https://github.com/apache/flink/pull/21822#discussion_r1170713429


##########
flink-dstl/flink-dstl-dfs/src/main/java/org/apache/flink/changelog/fs/DuplicatingStateChangeFsUploader.java:
##########
@@ -51,14 +52,15 @@
  * <li>Store the meta of files into {@link ChangelogTaskLocalStateStore} by
  * AsyncCheckpointRunnable#reportCompletedSnapshotStates().
  * <li>Pass control of the file to {@link LocalChangelogRegistry#register} when
- * ChangelogKeyedStateBackend#notifyCheckpointComplete() , files of the previous
- * checkpoint will be deleted by {@link LocalChangelogRegistry#discardUpToCheckpoint} at
- * the same time.
+ * FsStateChangelogWriter#persist , files of the previous checkpoint will be deleted by
+ * {@link LocalChangelogRegistry#discardUpToCheckpoint} when the previous checkpoint is
+ * confirmed.

Review Comment:
   Cleaning up unused local files after truncation was never actively addressed before this PR. The previous cleanup in _LocalChangelogRegistryImpl#prune()_ did prevent local changelog files from accumulating, but that approach is itself flawed: it ignores that these files may still be used by subsequent checkpoints.

   Before this PR, the remote changelog files also had no cleanup for the case where a checkpoint cannot complete successfully. The local changelog files only happened to avoid that problem because they were being cleaned up incorrectly. So I don't think it is fair to say that this PR introduces a new problem.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org