Yes, this is a really critical problem for large batch jobs, because unexpected failures are common. We are already working on realizing the ideas described in FLIP-1 and hope to contribute them to Flink in the coming months.

Best,
Zhijiang

------------------------------------------------------------------
From: Si-li Liu <[email protected]>
Sent: Friday, February 17, 2017 11:22
To: user <[email protected]>
Subject: Re: Flink batch processing fault tolerance

Hi,

This is the reason I gave up using Flink for my current project and went back to the traditional Hadoop framework.

2017-02-17 10:56 GMT+08:00 Renjie Liu <[email protected]>:

https://cwiki.apache.org/confluence/display/FLINK/FLIP-1+%3A+Fine+Grained+Recovery+from+Task+Failures

This FLIP may help.

On Thu, Feb 16, 2017 at 7:34 PM Anton Solovev <[email protected]> wrote:

Hi Aljoscha,

Could you share your plans for resolving it?

Best,
Anton

From: Aljoscha Krettek [mailto:[email protected]]
Sent: Thursday, February 16, 2017 2:48 PM
To: [email protected]
Subject: Re: Flink batch processing fault tolerance

Hi,

yes, this is indeed true. We had some plans for how to resolve this, but they never materialised because of the focus on stream processing. We might unite the two in the future, and then you will get fault-tolerant batch/stream processing in the same API.

Best,
Aljoscha

On Wed, 15 Feb 2017 at 09:28 Renjie Liu <[email protected]> wrote:

Hi, all:

I'm reading Flink's documentation and am curious about the fault tolerance of batch jobs. It seems that when a single task execution fails, the whole job is restarted. Is that true? If so, isn't it impractical to deploy large Flink batch jobs?

--
Liu, Renjie
Software Engineer, MVAD

--
Liu, Renjie
Software Engineer, MVAD

--
Best regards
Sili Liu
