Hi Yun and Till,

Thank you both for your responses.

For @Yun:
1. No, I just renamed the checkpoint directory, since the real directory name contains company data. Sorry for the confusion.
2. Yes, I set the following in flink-conf.yaml:

state.checkpoints.num-retained: 10
state.backend.rocksdb.predefined-options: FLASH_SSD_OPTIMIZED

I was expecting that the shared folder would no longer contain outdated state: since my TTL is set to 30 minutes, I shouldn't see data older than one day. However, I can still see such outdated data in the shared folder. For example, the current time is 2019-11-06 03:58:00 UTC, and I can see the following files on HDFS:

65.1 M  2019-11-04 17:58  /flink/checkpoint/c344b61c456af743e4568a70b626837b/shared/03dea380-758b-4d52-b335-5e6318ba6c40
2.1 K   2019-11-04 17:28  /flink/checkpoint/c344b61c456af743e4568a70b626837b/shared/1205f112-f5ba-4516-ae32-1424afda08ac
65.1 M  2019-11-04 17:58  /flink/checkpoint/c344b61c456af743e4568a70b626837b/shared/2298e34d-8cdc-4f8a-aac0-76cf4b9ac0f5
65.1 M  2019-11-04 17:58  /flink/checkpoint/c344b61c456af743e4568a70b626837b/shared/25e58576-f86f-4ac9-83b8-08ce0be036c4
65.1 M  2019-11-05 17:42  /flink/checkpoint/c344b61c456af743e4568a70b626837b/shared/27031a93-3ae5-4247-a751-62552c29f325

3. What I actually meant is that only the latest 10 checkpoints, each containing full state (around 20 GB each in my case), would be retained on HDFS. That way I could control how much data is stored on HDFS, rather than having an ever-growing shared folder. But storing full state on HDFS takes a lot of time, so I would still like to use incremental checkpoints.

For @Till:
I will give `cleanupInRocksdbCompactFilter` a try and see if it works. Thank you.
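For reference, here is roughly what I plan to try, based on Till's snippet (just a sketch; the descriptor name "myMapState" and the String key/value types are placeholders, not the ones from my actual job):

import org.apache.flink.api.common.state.{MapStateDescriptor, StateTtlConfig}
import org.apache.flink.api.common.time.Time

val ttlConfig = StateTtlConfig
  .newBuilder(Time.minutes(30))
  .updateTtlOnCreateAndWrite()
  .cleanupInBackground()
  .cleanupInRocksdbCompactFilter()
  .setStateVisibility(StateTtlConfig.StateVisibility.ReturnExpiredIfNotCleanedUp)
  .build()

// Placeholder descriptor; the TTL config only takes effect if it is
// attached to the descriptor before the state is first accessed.
val stateDescriptor = new MapStateDescriptor[String, String](
  "myMapState", classOf[String], classOf[String])
stateDescriptor.enableTimeToLive(ttlConfig)

If `cleanupInBackground()` alone already enables the compaction filter cleanup on 1.9.0, as Yun says, the explicit `cleanupInRocksdbCompactFilter()` call may be redundant, but I will keep it just to be safe.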
On Wed, 6 Nov 2019 at 01:50, Yun Tang <myas...@live.com> wrote:

> @Till Rohrmann, I think just setting `cleanupInBackground()` should be
> enough for RocksDB to clean up in the compaction filter after Flink 1.9.0 [1].
>
> @Shuwen, I have several questions about your case:
> 1. Is `flink-chk743e4568a70b626837b` the real checkpoint folder? I don't
> think a job id would look like this.
> 2. Why do you have 10 checkpoints left under the checkpoint folder? Did
> you configure the number of retained checkpoints as 10?
> 3. What do you mean by "while I disabled incremental cleanup, the expired
> full snapshot will be removed automatically"? I cannot see that you have
> configured state TTL with `cleanupIncrementally()`; moreover, what is the
> actual meaning of "removed automatically"?
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/stream/state/state.html#cleanup-in-background
>
> Best
> Yun Tang
>
> On 11/5/19, 11:24 PM, "Till Rohrmann" <trohrm...@apache.org> wrote:
>
> Hi Shuwen,
>
> I think the problem is that you configured state TTL to clean up on full
> snapshots, which aren't executed when using RocksDB with incremental
> snapshots. Instead you need to activate `cleanupInRocksdbCompactFilter`:
>
> val ttlConfig = StateTtlConfig
>   .newBuilder(Time.minutes(30))
>   .updateTtlOnCreateAndWrite()
>   .cleanupInBackground()
>   .cleanupInRocksdbCompactFilter()
>   .setStateVisibility(StateTtlConfig.StateVisibility.ReturnExpiredIfNotCleanedUp)
>
> Cheers,
> Till
>
> On Tue, Nov 5, 2019 at 4:04 PM shuwen zhou <jaco...@gmail.com> wrote:
>
> > Hi Jiayi,
> > I understand that the shared folder stores state from multiple
> > checkpoints. But shouldn't the shared folder only retain data across the
> > latest "state.checkpoints.num-retained" checkpoints and remove outdated
> > checkpoints?
> > In my case I suspect that outdated checkpoints' state wasn't cleaned up,
> > which makes the shared folder keep increasing even after the TTL has
> > passed.
> > On Tue, 5 Nov 2019 at 21:13, bupt_ljy <bupt_...@163.com> wrote:
> >
> > > Hi Shuwen,
> > >
> > > "Shared" means that the state files are shared among multiple
> > > checkpoints, which happens when you enable incremental
> > > checkpointing [1]. Therefore, it's expected that the size keeps
> > > growing if you set "state.checkpoints.num-retained" to a big value.
> > >
> > > [1]
> > > https://flink.apache.org/features/2018/01/30/incremental-checkpointing.html
> > >
> > > Best,
> > > Jiayi Liao
> > >
> > > Original Message
> > > Sender: shuwen zhou <jaco...@gmail.com>
> > > Recipient: dev <dev@flink.apache.org>
> > > Date: Tuesday, Nov 5, 2019 17:59
> > > Subject: RocksDB state on HDFS seems not being cleaned up
> > >
> > > Hi Community,
> > > I have a job running on Flink 1.9.0 on YARN, with RocksDB state on
> > > HDFS and incremental checkpointing enabled. I have some MapState in
> > > code with the following config:
> > >
> > > val ttlConfig = StateTtlConfig
> > >   .newBuilder(Time.minutes(30))
> > >   .updateTtlOnCreateAndWrite()
> > >   .cleanupInBackground()
> > >   .cleanupFullSnapshot()
> > >   .setStateVisibility(StateTtlConfig.StateVisibility.ReturnExpiredIfNotCleanedUp)
> > >
> > > After running for around 2 days, the checkpoint folder shows:
> > >
> > > 44.4 M   /flink-chk743e4568a70b626837b/chk-40
> > > 65.9 M   /flink-chk743e4568a70b626837b/chk-41
> > > 91.7 M   /flink-chk743e4568a70b626837b/chk-42
> > > 96.1 M   /flink-chk743e4568a70b626837b/chk-43
> > > 48.1 M   /flink-chk743e4568a70b626837b/chk-44
> > > 71.6 M   /flink-chk743e4568a70b626837b/chk-45
> > > 50.9 M   /flink-chk743e4568a70b626837b/chk-46
> > > 90.2 M   /flink-chk743e4568a70b626837b/chk-37
> > > 49.3 M   /flink-chk743e4568a70b626837b/chk-38
> > > 96.9 M   /flink-chk743e4568a70b626837b/chk-39
> > > 797.9 G  /flink-chk743e4568a70b626837b/shared
> > >
> > > The ./shared folder size seems to keep increasing, and the folder does
> > > not appear to be cleaned up. However, while I disabled incremental
> > > cleanup, the expired full snapshot will be removed automatically. Is
> > > there any way to remove outdated state on HDFS to stop it from
> > > increasing? Thanks.
> > >
> > > --
> > > Best Wishes,
> > > Shuwen Zhou
> >
> > --
> > Best Wishes,
> > Shuwen Zhou
>

--
Best Wishes,
Shuwen Zhou <http://www.linkedin.com/pub/shuwen-zhou/57/55b/599/>