Hi Yun,
Sorry for the late reply - I was doing some reading. As far as I
understand, when incremental checkpointing is enabled, the reported
checkpoint size (metrics/UI) is only the size of the deltas and not the full
state size. I understand that compaction may not get triggered. But if we
are creating a fixed amount of state every checkpoint interval,
shouldn't the reported checkpoint size remain the same (as it is a delta)?
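
To make the question concrete, here is a toy model in Python. It is not Flink or RocksDB code: the class and function names are made up, and it assumes that each checkpoint interval flushes exactly one new SST file of a fixed size and that a checkpoint uploads only the files it has not uploaded before. It tracks two numbers per checkpoint: the delta (newly uploaded files) and the total size of all files the checkpoint references.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIncrementalBackend:
    """Toy model of SST-file-based incremental checkpoints (not Flink code)."""
    live_files: dict = field(default_factory=dict)  # file name -> size in KB
    uploaded: set = field(default_factory=set)      # files already uploaded
    next_id: int = 0

    def flush(self, size_kb):
        """Pretend a memtable flush produced one new SST file."""
        name = f"sst-{self.next_id}"
        self.next_id += 1
        self.live_files[name] = size_kb

    def compact(self, merged_size_kb):
        """Pretend compaction replaced all live files with one merged file."""
        self.live_files.clear()
        self.flush(merged_size_kb)

    def checkpoint(self):
        """Return (delta_kb, referenced_kb) for this checkpoint."""
        new_files = [n for n in self.live_files if n not in self.uploaded]
        delta_kb = sum(self.live_files[n] for n in new_files)
        self.uploaded.update(new_files)
        referenced_kb = sum(self.live_files.values())
        return delta_kb, referenced_kb


def simulate(num_checkpoints, flush_kb, compact_every=None, merged_kb=None):
    """Run the toy model; optionally compact every `compact_every` intervals."""
    backend = ToyIncrementalBackend()
    history = []
    for i in range(1, num_checkpoints + 1):
        backend.flush(flush_kb)
        if compact_every and i % compact_every == 0:
            backend.compact(merged_kb)
        history.append(backend.checkpoint())
    return history


if __name__ == "__main__":
    for delta_kb, referenced_kb in simulate(4, 64):
        print(f"delta={delta_kb} KB, referenced={referenced_kb} KB")
```

In this model, `simulate(4, 64)` keeps the delta at 64 KB for every checkpoint, while the referenced total grows as 64, 128, 192, 256 KB when no compaction runs. So a steadily growing number would match the referenced total rather than a pure delta, at least under these assumptions.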


Thanks

Sudharsan

On Tue, Oct 13, 2020 at 11:34 PM Yun Tang <myas...@live.com> wrote:

> Hi
>
> The difference in data size between incremental and full checkpoints is due
> to their different implementations.
> The incremental checkpoint strategy uploads binary SST files, while the full
> checkpoint strategy scans the DB and writes all KV entries to the external DFS.
>
> As your state size is really small (only 200 KB), I think your RocksDB has
> never triggered a compaction to reduce the SST files, which is why the size
> constantly increases.
>
> Best
> Yun Tang
> ------------------------------
> *From:* sudranga <sud.r...@gmail.com>
> *Sent:* Wednesday, October 14, 2020 10:40
> *To:* user@flink.apache.org <user@flink.apache.org>
> *Subject:* Rocksdb - Incremental vs full checkpoints
>
> Hi,
> I have an event-window pipeline which handles a fixed number of messages
> per second for a fixed number of keys. When I have RocksDB as the state
> backend with incremental checkpoints, I see the delta checkpoint size
> constantly increase. Please see
> <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t2790/Screen_Shot_2020-10-13_at_6.png>
>
>
> I turned off incremental checkpoints and all the checkpoints are 64 KB
> (there appears to be no state leak in user code or otherwise). It is not
> clear why the incremental checkpoints keep increasing in size. Perhaps the
> incremental checkpoints are not incremental (for this small state size) and
> are simply full state appended to full state, and so on...
>
> From some posts on this forum, I understand that incremental checkpoints
> are designed for cases where the state size is fairly large (GBs to TBs) and
> where the changes in state are minimal across checkpoints. However, does
> this mean that we should not enable incremental checkpointing for use cases
> where the state size is much smaller? Would the 'constantly' increasing
> snapshot delta size reduce at some point? I don't see any compaction runs
> happening
> (taskmanager_job_task_operator_column_family_rocksdb.num-running-compactions).
> Not sure if that is what I am missing...
>
> Thanks
> Sudharsan
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>
