Dear Flink Community,

In our Flink application, we persist checkpoints to AWS S3. Recently, during periods of high job parallelism and traffic, we've experienced checkpoint failures. Upon investigating, it appears these may be related to S3 delete object requests interrupting checkpoint re-uploads, as evidenced by numerous InterruptedExceptions.
We are exploring options for disabling the deletion of stale checkpoints. We have consulted the Flink configuration documentation and run various tests, but have not found a setting that prevents old checkpoints from being cleaned up. Is there a way to disable the automatic cleanup of old Flink checkpoints?

Best,
Yang