Hi Xiaogang, It is an interesting topic.
Notice that there is some effort to build a mature mllib of flink these days, it could be also possible for some ml cases trade off correctness for timeliness or throughput. Excatly-once delivery excatly makes flink stand out but an at-most-once option would adapt flink to more scenarios. Best, tison. SHI Xiaogang <shixiaoga...@gmail.com> 于2019年6月11日周二 下午4:33写道: > Flink offers a fault-tolerance mechanism to guarantee at-least-once and > exactly-once message delivery in case of failures. The mechanism works well > in practice and makes Flink stand out among stream processing systems. > > But the guarantee on at-least-once and exactly-once delivery does not come > without price. It typically requires to restart multiple tasks and fall > back to the place where the last checkpoint is taken. (Fined-grained > recovery can help alleviate the cost, but it still needs certain efforts to > recover jobs.) > > In some senarios, users perfer quick recovery and will trade correctness > off. For example, in some online recommendation systems, timeliness is far > more important than consistency. In such cases, we can restart only those > failed tasks individually, and do not need to perform any rollback. Though > some messages delivered to failed tasks may be lost, other tasks can > continuously provide service to users. > > Many of our users are demanding for at-most-once delivery in Flink. What do > you think of the proposal? Any feedback is appreciated. > > Regards, > Xiaogang Shi >