Re: [DISCUSS] Allow at-most-once delivery in case of failures

Zili Chen Tue, 11 Jun 2019 01:54:59 -0700

Hi Xiaogang,

It is an interesting topic.


Notice that there is some effort to build a mature mllib of flink these
days, it could be also possible for some ml cases trade off correctness for
timeliness or throughput. Excatly-once delivery excatly makes flink stand
out but an at-most-once option would adapt flink to more scenarios.

Best,
tison.


SHI Xiaogang <[email protected]> 于2019年6月11日周二 下午4:33写道：

> Flink offers a fault-tolerance mechanism to guarantee at-least-once and
> exactly-once message delivery in case of failures. The mechanism works well
> in practice and makes Flink stand out among stream processing systems.
>
> But the guarantee on at-least-once and exactly-once delivery does not come
> without price. It typically requires to restart multiple tasks and fall
> back to the place where the last checkpoint is taken. (Fined-grained
> recovery can help alleviate the cost, but it still needs certain efforts to
> recover jobs.)
>
> In some senarios, users perfer quick recovery and will trade correctness
> off. For example, in some online recommendation systems, timeliness is far
> more important than consistency. In such cases, we can restart only those
> failed tasks individually, and do not need to perform any rollback. Though
> some messages delivered to failed tasks may be lost, other tasks can
> continuously provide service to users.
>
> Many of our users are demanding for at-most-once delivery in Flink. What do
> you think of the proposal? Any feedback is appreciated.
>
> Regards,
> Xiaogang Shi
>

Re: [DISCUSS] Allow at-most-once delivery in case of failures

Reply via email to