Hi,

We are on Flink 1.20 / Java 17 running in a k8s environment, with checkpoints enabled on S3 and the following checkpoint options:

execution.checkpointing.dir: s3://flink-application/checkpoints
execution.checkpointing.externalized-checkpoint-retention: DELETE_ON_CANCELLATION
execution.checkpointing.interval: 150000 ms
execution.checkpointing.min-pause: 30000 ms
execution.checkpointing.mode: EXACTLY_ONCE
execution.checkpointing.savepoint-dir: s3://flink-application/savepoints
execution.checkpointing.timeout: 10 min
execution.checkpointing.tolerable-failed-checkpoints: "3"

We have been through quite a few Flink application restarts, caused by streaming failures for various reasons (mostly Kafka related) but also by Flink application changes. The Flink application then tends to be resumed from savepoints, but we have noticed that an increasing number of checkpoints are left behind.

Is there a built-in way of cleaning up these obsolete checkpoints?

I suppose what we do not really understand is the condition(s) under which Flink may not clean up checkpoints. Can someone explain?

Thanks,
JM