Thanks for starting this discussion, Yingjie. How will our tests be affected by these changes? Will Flink require more resources and thus risk destabilizing our testing infrastructure?
I would propose to create a FLIP for these changes since you propose to change the default behaviour. It can be a very short one, though.

Cheers,
Till

On Fri, Dec 3, 2021 at 10:02 AM Yingjie Cao <kevin.ying...@gmail.com> wrote:

> Hi dev & users,
>
> We propose to change some default values of blocking shuffle to improve
> the out-of-the-box user experience (streaming is not affected). The
> default values we want to change are as follows:
>
> 1. Data compression
> (taskmanager.network.blocking-shuffle.compression.enabled): Currently,
> the default value is 'false'. Data compression usually reduces both disk
> and network IO, which is good for performance, and it also saves storage
> space. We propose to change the default value to 'true'.
>
> 2. Default shuffle implementation
> (taskmanager.network.sort-shuffle.min-parallelism): Currently, the
> default value is 'Integer.MAX_VALUE', which means Flink jobs always use
> hash-shuffle by default. In fact, for high parallelism, sort-shuffle is
> better for both stability and performance. We therefore propose to
> reduce the default value to a smaller one, for example, 128. (We tested
> 128, 256, 512 and 1024 with a TPC-DS benchmark, and 128 performed best.)
>
> 3. Read buffer of sort-shuffle
> (taskmanager.memory.framework.off-heap.batch-shuffle.size): Currently,
> the default value is '32M'. When the default was originally chosen, both
> '32M' and '64M' passed our tests and we conservatively picked the
> smaller one. However, it was recently reported on the mailing list that
> the default value is not enough, which caused a buffer request timeout
> issue. We already created a ticket to improve the behavior. At the same
> time, we propose to increase this default value to '64M', which can
> also help.
>
> 4. Sort buffer size of sort-shuffle
> (taskmanager.network.sort-shuffle.min-buffers): Currently, the default
> value is '64', which means 64 network buffers (32K per buffer by
> default). This default value is quite modest and can limit performance.
> We propose to increase it to a larger value, for example, 512 (with the
> default TM and network buffer configuration, this can serve more than
> 10 result partitions concurrently).
>
> We tested these default values together with the TPC-DS benchmark on a
> cluster, and both performance and stability improved significantly.
> These changes can help to improve the out-of-the-box experience of
> blocking shuffle. What do you think about these changes? Is there any
> concern? If there are no objections, I will make these changes soon.
>
> Best,
> Yingjie
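
For anyone who wants to try these values before the defaults actually change, here is a minimal flink-conf.yaml sketch that sets the proposed values explicitly (the keys and values are taken from the proposal above; tune them to your own cluster and workload):

    # Proposed blocking-shuffle defaults from this thread, set explicitly
    # so they take effect on current Flink versions as well.

    # 1. Compress blocking shuffle data (current default: false)
    taskmanager.network.blocking-shuffle.compression.enabled: true

    # 2. Use sort-shuffle for parallelism >= 128
    #    (current default: Integer.MAX_VALUE, i.e. always hash-shuffle)
    taskmanager.network.sort-shuffle.min-parallelism: 128

    # 3. Off-heap read buffer for sort-shuffle (current default: 32m)
    taskmanager.memory.framework.off-heap.batch-shuffle.size: 64m

    # 4. Minimum network buffers per sort-shuffle result partition
    #    (current default: 64 buffers of 32K each)
    taskmanager.network.sort-shuffle.min-buffers: 512

With these four lines a cluster already runs with the proposed behaviour today, and removing them later simply falls back to whatever defaults ship with the release.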