Hi, I have a question regarding Flink checkpointing configuration.
I am working from the official docs here: https://nightlies.apache.org/flink/flink-docs-stable/docs/deployment/config/ and running Flink 1.14.4.

I would like to take a checkpoint every 10 minutes, with at least a 10-minute pause between checkpoints. So I have set the following properties:

    execution.checkpointing.interval: 10 min
    execution.checkpointing.min-pause: 10 min

That works in the positive scenario where my job runs fine. However, when a checkpoint times out, it seems that the min-pause is not applied, e.g.:

    t       - checkpoint #1 starts
    t+10min - checkpoint #1 fails due to timeout (execution.checkpointing.timeout defaults to 10min)
    t+10min - checkpoint #2 starts

I would expect (and want to achieve):

    t       - checkpoint #1 starts
    t+10min - checkpoint #1 fails due to timeout (execution.checkpointing.timeout defaults to 10min)
    t+20min (t+10min (timeout) + 10min (min-pause)) - checkpoint #2 starts

I expect that because:
- checkpoint #1 did not succeed, but it did finish at t+10min. I expect the min-pause to start counting from there.
- if checkpoint #1 failed with a timeout, checkpoint #2, which starts immediately after it, is very unlikely to succeed.

At this point I am not sure whether I am misunderstanding the docs or how I should configure my job. When I set the configuration like so:

    execution.checkpointing.interval: 10 min
    execution.checkpointing.min-pause: 15 min

then I get checkpoints every 15 min instead.

Can someone help me understand the docs better and configure my job?

Thanks

Regards,
Nikola
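
P.S. To make the two timelines concrete, here is a small toy model of the arithmetic. It is not Flink code and does not claim to reflect how Flink's checkpoint coordinator is implemented; the function `next_trigger` and its flag `pause_applies` are names I made up for illustration. It assumes only one checkpoint runs at a time, and all times are minutes after t.

```python
def next_trigger(prev_start, prev_end, interval, min_pause, pause_applies):
    """Earliest start time of the next checkpoint in this toy model.

    prev_start / prev_end: when the previous checkpoint started and ended
    (ended = completed or timed out).
    pause_applies: whether min_pause is counted from prev_end
    (hypothetical switch, used here to compare the two timelines).
    """
    # The periodic trigger fires one interval after the previous start.
    earliest = prev_start + interval
    if pause_applies:
        # Hold the next checkpoint until min_pause has elapsed since the end.
        earliest = max(earliest, prev_end + min_pause)
    # Never start before the previous checkpoint has ended.
    return max(earliest, prev_end)


# Checkpoint #1 starts at t and times out at t+10 (interval = min-pause = 10).
observed = next_trigger(0, 10, interval=10, min_pause=10, pause_applies=False)
expected = next_trigger(0, 10, interval=10, min_pause=10, pause_applies=True)
print(observed, expected)
```

With the pause ignored after the timeout, the model gives t+10min for checkpoint #2, which is what I observe; with the pause counted from the timeout, it gives t+20min, which is what I want.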