Hi Stefan,

Thanks for your reply. Very interesting ideas!

If I understand correctly, SharedStateRegistry will still be responsible for pruning the old state; for that, it will maintain some (ordered) mapping between StateMaps and their versions, per key group.

I think one modification to this approach is needed to support journaling: for each entry, maintain the version at which it was last fully snapshotted, and use this version to find the minimum as you described above.

I'm considering a better state cleanup and optimization of removals as the next step.

Anyway, I will add it to the FLIP document.
Thanks!

Regards,
Roman

On Tue, Nov 10, 2020 at 12:04 AM Stefan Richter <stefanrichte...@gmail.com> wrote:

> Hi,
>
> Very happy to see that the incremental checkpoint idea is finally becoming
> a reality for the heap backend! Overall the proposal looks pretty good to
> me. Just wanted to point out one possible improvement from what I can still
> remember from my ideas back then: I think you can avoid doing periodic full
> snapshots for consolidation. Instead, my suggestion would be to track the
> version numbers you encounter while you iterate a snapshot for writing it -
> and then you should be able to prune all incremental snapshots that were
> performed with a version number smaller than the minimum you find. To avoid
> the problem of very old entries that never get modified you could start
> spilling entries with a certain age-difference compared to the current map
> version so that eventually all entries for an old version are re-written to
> newer snapshots. You can track the version up to which this was done in the
> map and then you can again let go of their corresponding snapshots after a
> guaranteed time. So instead of having the burden of periodic large
> snapshots, you can make every snapshot work a little bit on the cleanup and
> if you are lucky it might happen mostly by itself if most entries are
> frequently updated. I would also consider making map clean a special event
> in your log and consider unticking the versions on this event - this allows
> you to let go of old snapshots and saves you from writing a log of
> antimatter entries. Maybe the ideas are still useful to you.
>
> Best,
> Stefan
>
> On 2020/11/04 01:54:25, Khachatryan Roman <k...@gmail.com> wrote:
> > Hi devs,
> >
> > I'd like to start a discussion of FLIP-151: Incremental snapshots for
> > heap-based state backend [1]
> >
> > Heap backend, while being limited to state sizes fitting into memory,
> > also has some advantages compared to RocksDB backend:
> > 1. Serialization once per checkpoint, not per state modification. This
> > allows to “squash” updates to the same keys
> > 2. Shorter synchronous phase (compared to RocksDB incremental)
> > 3. No need for sorting and compaction, no IO amplification and JNI
> > overhead
> > This can potentially give higher throughput and efficiency.
> >
> > However, Heap backend currently lacks incremental checkpoints. This FLIP
> > aims to add initial support for them.
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-151%3A+Incremental+snapshots+for+heap-based+state+backend
> >
> > Any feedback highly appreciated.
> >
> > Regards,
> > Roman
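
For illustration only, the pruning rule discussed in this thread (find the minimum version among the entries seen while writing a snapshot, then let go of every incremental snapshot older than that minimum) might be sketched roughly as below. This is a minimal sketch under stated assumptions, not the FLIP-151 design: the names SnapshotPruningSketch, Entry, lastFullSnapshotVersion, minVersion and pruneBelow are hypothetical and are not Flink APIs, snapshots are represented by plain string handles, and key groups, spilling of old entries and the real SharedStateRegistry are not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; none of these names exist in Flink.
class SnapshotPruningSketch {

    /** Hypothetical state entry: a value plus the version at which it was last fully snapshotted. */
    static final class Entry {
        final String value;
        final long lastFullSnapshotVersion;

        Entry(String value, long lastFullSnapshotVersion) {
            this.value = value;
            this.lastFullSnapshotVersion = lastFullSnapshotVersion;
        }
    }

    // Retained incremental snapshots, ordered by version (assumed: one handle per version).
    private final TreeMap<Long, String> retained = new TreeMap<>();

    void register(long version, String handle) {
        retained.put(version, handle);
    }

    List<Long> retainedVersions() {
        return new ArrayList<>(retained.keySet());
    }

    /** Smallest version still referenced by any entry, found while iterating the snapshot. */
    static long minVersion(Iterable<Entry> entries, long currentVersion) {
        long min = currentVersion;
        for (Entry e : entries) {
            min = Math.min(min, e.lastFullSnapshotVersion);
        }
        return min;
    }

    /** Drops every retained snapshot with a version strictly below minVersion; returns the dropped handles. */
    List<String> pruneBelow(long minVersion) {
        Map<Long, String> stale = retained.headMap(minVersion, false);
        List<String> dropped = new ArrayList<>(stale.values());
        stale.clear(); // headMap is a live view, so this removes the entries from 'retained'
        return dropped;
    }

    public static void main(String[] args) {
        SnapshotPruningSketch registry = new SnapshotPruningSketch();
        registry.register(1L, "snap-1");
        registry.register(2L, "snap-2");
        registry.register(3L, "snap-3");

        List<Entry> entries = Arrays.asList(
                new Entry("a", 2L), new Entry("b", 3L), new Entry("c", 3L));

        long min = minVersion(entries, 3L);
        System.out.println(registry.pruneBelow(min)); // prints [snap-1]
    }
}
```

In this sketch an entry that is never modified keeps the minimum low indefinitely, which is the case the spilling of old entries mentioned in the thread is meant to address.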