Thanks Xiaowei for the inspiring comments! Yes, we could increase the granularity of speculation from a single task to a bundle of successive tasks, especially for tasks connected by pipelined channels.
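To make the "bundle" idea concrete, here is a rough sketch rather than the actual design. It assumes a bundle is every task reachable from the slow task over pipelined edges, with those edges treated as undirected. The function name `speculation_bundle` and the `(producer, consumer, kind)` edge format are made up for illustration and are not Flink APIs.

```python
from collections import defaultdict, deque


def speculation_bundle(slow_task, edges):
    """Return the set of tasks that would be speculated together.

    edges is an iterable of (producer, consumer, kind) tuples, where kind is
    "pipelined" or "blocking" (hypothetical format, for illustration only).
    """
    # Only pipelined edges tie tasks together; blocking edges are boundaries.
    neighbors = defaultdict(set)
    for producer, consumer, kind in edges:
        if kind == "pipelined":
            neighbors[producer].add(consumer)
            neighbors[consumer].add(producer)

    # Breadth-first search from the slow task over pipelined edges.
    bundle = {slow_task}
    queue = deque([slow_task])
    while queue:
        task = queue.popleft()
        for other in neighbors[task] - bundle:
            bundle.add(other)
            queue.append(other)
    return bundle


edges = [
    ("source", "map", "pipelined"),
    ("map", "reduce", "blocking"),
    ("reduce", "sink", "pipelined"),
]
print(sorted(speculation_bundle("map", edges)))
```

A task with no pipelined neighbors gets a bundle containing only itself, which is the traditional single-task speculation that Xiaowei describes as a special case.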

Xiaowei Jiang <xiaow...@gmail.com> wrote on Sun, Nov 18, 2018 at 2:24 PM:

> Thanks Yangyu for the nice design doc! One thing to consider is the
> granularity of speculation. Multiple tasks may propagate data in
> pipelined mode. In that case, fixing a single task may not be enough, but
> you might be able to solve this by increasing the granularity of
> speculation. The traditional case of a single speculative task can be
> considered a special case of this.
>
> Xiaowei
>
> On Sat, Nov 17, 2018 at 10:27 PM Tao Yangyu <ryantao...@gmail.com> wrote:
>
> > Hi all,
> >
> > After some refinement, the detailed design doc is here:
> >
> > https://docs.google.com/document/d/1X_Pfo4WcO-TEZmmVTTYNn44LQg5gnFeeaeqM7ZNLQ7M/edit?usp=sharing
> >
> > Your reviews and comments are much appreciated and will help a lot in
> > completing the feature.
> >
> > Best,
> > Ryan
> >
> > Tao Yangyu <ryantao...@gmail.com> wrote on Wed, Nov 7, 2018 at 4:49 PM:
> >
> > > Thanks so much for all your feedback!
> > >
> > > Yes, as Jin Sun mentioned above, the design currently targets batch,
> > > to explore the general framework and basic modules. The strategy could
> > > also be applied to streaming with some extended code, for example for
> > > result commitment.
> > >
> > > Jin Sun <isun...@gmail.com> wrote on Wed, Nov 7, 2018 at 8:38 AM:
> > >
> > >> I think this targets batch at the very beginning; the idea should
> > >> also work for both cases, with a different algorithm/strategy.
> > >>
> > >> Ryan, since you are working on this, I will assign FLINK-10644
> > >> <https://issues.apache.org/jira/browse/FLINK-10644> to you.
> > >>
> > >> Jin
> > >>
> > >> > On Nov 6, 2018, at 4:45 AM, Till Rohrmann <trohrm...@apache.org> wrote:
> > >> >
> > >> > Thanks for starting this discussion, Ryan. I'm looking forward to
> > >> > your design document about this feature. Quick question: Will it be
> > >> > a batch-only feature? If not, then it needs to take checkpointing
> > >> > into account as well.
> > >> >
> > >> > Cheers,
> > >> > Till
> > >> >
> > >> > On Tue, Nov 6, 2018 at 4:29 AM zhijiang
> > >> > <wangzhijiang...@aliyun.com.invalid> wrote:
> > >> >
> > >> >> Thanks Yangyu for launching this discussion.
> > >> >>
> > >> >> I really like this proposal. In production we have frequently seen
> > >> >> a few long-tail tasks delay the total execution time of a batch
> > >> >> job. We also have some thoughts on introducing this mechanism.
> > >> >> Looking forward to your detailed design doc; then we can discuss
> > >> >> further.
> > >> >>
> > >> >> Best,
> > >> >> Zhijiang
> > >> >> ------------------------------------------------------------------
> > >> >> From: Tao Yangyu <ryantao...@gmail.com>
> > >> >> Sent: Tuesday, Nov 6, 2018, 11:01
> > >> >> To: dev <dev@flink.apache.org>
> > >> >> Subject: [DISCUSS] Task speculative execution for Flink batch
> > >> >>
> > >> >> Hi everyone,
> > >> >>
> > >> >> We propose task speculative execution for Flink batch, as follows.
> > >> >>
> > >> >> In batch mode, a job is usually divided into multiple parallel
> > >> >> tasks executed across many nodes in the cluster. It is common to
> > >> >> encounter performance degradation on some nodes due to hardware
> > >> >> problems, or incidental I/O contention and high CPU load. This
> > >> >> kind of degradation can make the tasks running on the node very
> > >> >> slow; these are the so-called long-tail tasks. Although long-tail
> > >> >> tasks will not fail, they can severely affect the total job
> > >> >> running time. The Flink task scheduler does not currently take
> > >> >> this long-tail problem into account.
> > >> >>
> > >> >> Here we propose a speculative execution strategy to handle the
> > >> >> problem. The basic idea is to run a copy of a task on another node
> > >> >> when the original task is identified as a long-tail task. In more
> > >> >> detail, the speculative task is triggered when the scheduler
> > >> >> detects that the data processing throughput of a task is much
> > >> >> lower than that of the others. The speculative task is executed in
> > >> >> parallel with the original one and shares the same failure retry
> > >> >> mechanism. Once either task completes, the scheduler accepts its
> > >> >> output as the final result and cancels the other running one.
> > >> >> Preliminary experiments have demonstrated the effectiveness.
> > >> >>
> > >> >> The detailed design doc will be ready soon. Your reviews and
> > >> >> comments will be much appreciated.
> > >> >>
> > >> >> Thanks!
> > >> >>
> > >> >> Ryan
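
For readers who want the strategy from the original proposal in a more concrete form, the following is a minimal sketch, not the design from the doc. It assumes a task counts as long-tail when its throughput falls below a fixed fraction of the median throughput of its peers, and that whichever attempt has the earlier estimated finish time wins. The names `Attempt`, `find_stragglers`, `resolve`, and the `slow_ratio` threshold are all hypothetical.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Attempt:
    task_id: str
    node: str
    throughput: float  # records processed per second
    remaining: float   # records still to process


def find_stragglers(attempts, slow_ratio=0.5):
    """Return attempts whose throughput is below slow_ratio * median."""
    med = median(a.throughput for a in attempts)
    return [a for a in attempts if a.throughput < slow_ratio * med]


def estimated_finish(attempt):
    """Seconds until the attempt finishes at its current throughput."""
    if attempt.throughput <= 0:
        return float("inf")
    return attempt.remaining / attempt.throughput


def resolve(original, speculative):
    """Return (winner, cancelled); the original wins ties."""
    if estimated_finish(original) <= estimated_finish(speculative):
        return original, speculative
    return speculative, original


attempts = [
    Attempt("t1", "node-a", 100.0, 1000.0),
    Attempt("t2", "node-b", 110.0, 1000.0),
    Attempt("t3", "node-c", 10.0, 900.0),  # degraded node
]
for slow in find_stragglers(attempts):
    # The copy starts from scratch on a healthy node (assumed throughput).
    copy = Attempt(slow.task_id, "node-d", 100.0, 1000.0)
    winner, cancelled = resolve(slow, copy)
    print(f"{slow.task_id}: keep {winner.node}, cancel {cancelled.node}")
```

In a real scheduler the winner would be decided by which attempt actually completes first rather than by an estimate; the estimate is used here only to keep the sketch deterministic.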