Thanks for preparing this FLIP, Weijie.

I think this is a good improvement on batch resource elasticity. Looking
forward to the community feedback.

Best,

Xintong



On Thu, May 19, 2022 at 2:31 PM weijie guo <guoweijieres...@gmail.com>
wrote:

> Hi all,
>
>
> I’d like to start a discussion about FLIP-235[1], which introduces a new
> shuffle mode that can overcome some of the problems of Pipelined Shuffle and
> Blocking Shuffle in batch scenarios.
>
>
> Currently in Flink, task scheduling is more or less constrained by the 
> shuffle implementations.
> This has the following disadvantages:
>
>    1. Pipelined Shuffle:
>     For pipelined shuffle, the upstream and downstream tasks are required to
> be deployed at the same time, to avoid upstream tasks being blocked forever.
> This is fine when there are enough resources for both upstream and downstream
> tasks to run simultaneously, but will cause the following problems otherwise:
>       1. Tasks connected by pipelined shuffle (i.e., a pipelined region)
> cannot be executed until resources have been obtained for all of them,
> resulting in longer job finishing time and poorer resource efficiency, because
> part of the resources is held idle while waiting for the rest.
>       2. More severely, if multiple jobs each hold part of the cluster
> resources and are waiting for more, a deadlock can occur. The chance of this
> is not trivial, especially for scenarios such as OLAP where concurrent job
> submissions are frequent.
>    2. Blocking Shuffle:
>     For blocking shuffle, execution of downstream tasks must wait for all
> upstream tasks to finish, even though more resources might be available.
> The sequential execution of upstream and downstream tasks significantly
> increases the job finishing time, and the disk IO workload for spilling and
> loading full intermediate data also affects the performance.
>
>
> We believe the root cause of the above problems is that shuffle 
> implementations put unnecessary constraints on task scheduling.
>
>
> To solve this problem, Xintong Song and I propose to introduce hybrid shuffle 
> to minimize the scheduling constraints. With Hybrid Shuffle, Flink should:
>
>    1. Make the best use of available resources.
>     Ideally, we want Flink to always make progress if possible. That is to
> say, it should always execute a pending task if there are resources available
> for that task.
>    2. Minimize disk IO load.
>     In-flight data should be consumed directly from memory as much as
> possible. Only data that is not consumed in time should be spilled to disk.
>
> You can find more details in FLIP-235. Looking forward to your feedback.
>
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-235%3A+Hybrid+Shuffle+Mode
>
>
>
> Best regards,
>
> Weijie
>
