Hi Yun,

Sorry for the late reply - I was doing some reading. As far as I understand, when incremental checkpointing is enabled, the reported checkpoint size (metrics/UI) is only the size of the deltas and not the full state size. I understand that compaction may not get triggered. But if we are creating a fixed amount of state every checkpoint interval, shouldn't the reported checkpoint size remain the same (as it is a delta)?
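To make my expectation concrete, here is a toy model in Python. It is not Flink or RocksDB code: the function name `simulate` and the sizes are made up, and it assumes exactly one new sst file is flushed per checkpoint interval, that no compaction ever runs, and that an incremental checkpoint uploads only files it has not uploaded before.

```python
# Toy model only: hypothetical names and sizes, not Flink or RocksDB internals.
# Assumptions: one new sst file per checkpoint interval, no compaction,
# and an incremental checkpoint uploads only files it has not seen before.

def simulate(num_checkpoints, new_bytes_per_interval):
    """Return a list of (checkpoint_id, delta_bytes, total_sst_bytes)."""
    sst_files = {}    # file name -> size in bytes, all files currently on disk
    uploaded = set()  # file names already uploaded by an earlier checkpoint
    report = []
    for cp in range(1, num_checkpoints + 1):
        # The state written during this interval is flushed as one new file.
        sst_files[f"sst-{cp}"] = new_bytes_per_interval

        # The delta is the size of the files this checkpoint has to upload.
        new_files = [name for name in sst_files if name not in uploaded]
        delta = sum(sst_files[name] for name in new_files)
        uploaded.update(new_files)

        # Without compaction, old files are never merged or dropped.
        total = sum(sst_files.values())
        report.append((cp, delta, total))
    return report


if __name__ == "__main__":
    for cp, delta, total in simulate(4, 64 * 1024):
        print(f"checkpoint {cp}: delta={delta} bytes, all sst files={total} bytes")
```

Under these assumptions the delta stays flat while the total size of all sst files grows, which is why I expected a constant number in the UI.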

Thanks
Sudharsan

On Tue, Oct 13, 2020 at 11:34 PM Yun Tang <myas...@live.com> wrote:

> Hi
>
> This difference in data size between incremental and full checkpoints is
> due to the different implementations.
> The incremental checkpoint strategy uploads binary sst files, while the full
> checkpoint strategy scans the DB and writes all kv entries to external DFS.
>
> As your state size is really small (only 200 KB), I think your RocksDB has
> never triggered compaction to reduce sst files; that's why the size
> constantly increases.
>
> Best
> Yun Tang
> ------------------------------
> *From:* sudranga <sud.r...@gmail.com>
> *Sent:* Wednesday, October 14, 2020 10:40
> *To:* user@flink.apache.org <user@flink.apache.org>
> *Subject:* Rocksdb - Incremental vs full checkpoints
>
> Hi,
> I have an event-window pipeline which handles a fixed number of messages
> per second for a fixed number of keys. When I have RocksDB as the state
> backend with incremental checkpoints, I see the delta checkpoint size
> constantly increase. Please see
> <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t2790/Screen_Shot_2020-10-13_at_6.png>
>
> I turned off incremental checkpoints and all the checkpoints are 64 KB
> (there appears to be no state leak in user code or otherwise). It is not
> clear why the incremental checkpoints keep increasing in size. Perhaps the
> incremental checkpoints are not incremental (for this small state size) and
> are simply full state appended to full state and so on...
>
> From some posts on this forum, I understand that incremental checkpoints
> are designed for use cases where the state size is fairly large (GBs to TBs)
> and where the changes in state are minimal across checkpoints. However, does
> this mean that we should not enable incremental checkpointing for use cases
> where the state size is much smaller? Would the 'constantly' increasing
> snapshot delta size reduce at some point? I don't see any compaction runs
> happening
> (taskmanager_job_task_operator_column_family_rocksdb.num-running-compactions).
> Not sure if that is what I am missing...
>
> Thanks
> Sudharsan