Hi all,

To minimize recovery time after a failure, we use incremental, retained
checkpoints with `state.checkpoints.num-retained` set to 10 in our Flink apps.
With this setting, Flink creates new checkpoints at regular intervals and keeps
only the latest 10. In addition, for app upgrades and better reliability, we
have a cron job that creates savepoints at regular intervals.
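
To make our understanding of the retention setting concrete, here is a minimal sketch in Python. It is only a toy model of "keep the latest 10" as we understand it, not Flink code: the function name `retained_after` and the integer checkpoint IDs are made up for illustration, and it does not model savepoints or the files shared between incremental checkpoints.

```python
from collections import deque

NUM_RETAINED = 10  # mirrors state.checkpoints.num-retained in our setup


def retained_after(checkpoint_ids, num_retained=NUM_RETAINED):
    """Return the checkpoint IDs still kept once every ID in checkpoint_ids has completed.

    Hypothetical helper: models retention as a sliding window over completed checkpoints.
    """
    window = deque(maxlen=num_retained)  # the oldest entry is dropped automatically
    for cp in checkpoint_ids:
        window.append(cp)
    return list(window)


# After 12 completed checkpoints, only the latest 10 remain in the window.
print(retained_after(range(1, 13)))
```

Under this model, checkpoints 1 and 2 fall out of the window once checkpoints 11 and 12 complete.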



We have two questions about checkpoint retention.

   1. When our cron job creates a savepoint (call it SP), it seems that the
   checkpoints created earlier than SP still cannot be deleted. We assumed that
   new checkpoints are generated based on SP, so checkpoints older than SP
   would become useless. However, the checkpoint mechanism does not seem to
   work that way. Is our assumption correct?
   2. To save storage cost, we’d like to know which checkpoints can be
   deleted. Currently, each version of our app has 10 checkpoints. Can we
   delete the checkpoints generated for previous versions of our app?


Any comments are appreciated!


Best wishes,

Chen-Che


An example is below (a checkpoint is generated every 30 minutes, while a
savepoint is created every 2 hours):

1:00 Flink creates a checkpoint
1:30 Flink creates a checkpoint
2:00 Flink creates a checkpoint
2:30 Flink creates a checkpoint
3:00 Cron job creates a savepoint (SP)
3:30 Flink creates a checkpoint
4:00 Flink creates a checkpoint
...
