Hi Roman,

+1 from my side on this proposal. Also a big +1 for the recent changes to the motivation and high-level overview sections of this FLIP.
For me there are still quite a few unanswered questions about how to actually implement the proposed changes, and especially how to integrate them with the state backends and checkpointing, but maybe we can cover that in follow-up design docs, discuss it in the tickets, or even work it out in a PoC.

Piotrek

On Fri, 15 Jan 2021 at 07:49, Khachatryan Roman <khachatryan.ro...@gmail.com> wrote:

> Hi devs,
>
> I'd like to start a discussion of FLIP-158: Generalized incremental
> checkpoints [1]
>
> FLIP motivation:
> Low end-to-end latency is a much-demanded property in many Flink setups.
> With exactly-once, this latency depends on checkpoint interval/duration
> which in turn is defined by the slowest node (usually the one doing a full
> non-incremental snapshot). In large setups with many nodes, the probability
> of at least one node being slow gets higher, making almost every checkpoint
> slow.
>
> This FLIP proposes a mechanism to deal with this by materializing and
> uploading state continuously and only uploading the changed part during the
> checkpoint itself. It differs from other approaches in that 1) checkpoints
> are always incremental; 2) works for any state backend.
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-158%3A+Generalized+incremental+checkpoints
>
> Any feedback highly appreciated!
>
> Regards,
> Roman