Re: [DISCUSS] Change some default config values of blocking shuffle

2022-01-06 Thread Yingjie Cao
Hi all, Thanks very much for all of the feedback. It seems that we have reached a consensus. I will start a vote soon. Best, Yingjie Yun Gao wrote on Wed, Jan 5, 2022 at 16:08: > Many thanks @Yingjie for completing the experiments! > > Also +1 for changing the default config values. From the experiments, >

Re: [DISCUSS] Change some default config values of blocking shuffle

2022-01-05 Thread Yun Gao
Many thanks @Yingjie for completing the experiments! Also +1 for changing the default config values. From the experiments, changing the default config values would largely improve the out-of-the-box experience of Flink batch, so it seems worth changing from my side even if it would cause some c

Re: [DISCUSS] Change some default config values of blocking shuffle

2022-01-04 Thread 刘建刚
Thanks for the experiment. +1 for the changes. Yingjie Cao wrote on Tue, Jan 4, 2022 at 17:35: > Hi all, > > After running some tests with the proposed default value ( > taskmanager.network.sort-shuffle.min-parallelism: 1, > taskmanager.network.sort-shuffle.min-buffers: 512, > taskmanager.memory.framework.off

Re: [DISCUSS] Change some default config values of blocking shuffle

2022-01-04 Thread Yingjie Cao
Hi all, After running some tests with the proposed default values (taskmanager.network.sort-shuffle.min-parallelism: 1, taskmanager.network.sort-shuffle.min-buffers: 512, taskmanager.memory.framework.off-heap.batch-shuffle.size: 64m, taskmanager.network.blocking-shuffle.compression.enabled: true),
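
For reference, the four proposed values named in the message above could be set explicitly in a Flink configuration file roughly as follows. This is only a sketch: the keys and values are copied from the message, while placing them in flink-conf.yaml is an assumption about how a reader would try them out on a version where they are not yet the defaults.

```yaml
# Proposed blocking-shuffle values, as listed in the thread.
# Use sort-shuffle for all parallelisms.
taskmanager.network.sort-shuffle.min-parallelism: 1
# Minimum number of network buffers for sort-shuffle.
taskmanager.network.sort-shuffle.min-buffers: 512
# Framework off-heap memory reserved for batch shuffle reads.
taskmanager.memory.framework.off-heap.batch-shuffle.size: 64m
# Compress blocking-shuffle data.
taskmanager.network.blocking-shuffle.compression.enabled: true
```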

Re: [DISCUSS] Change some default config values of blocking shuffle

2021-12-14 Thread Yingjie Cao
Hi Till, Thanks for the suggestion. I think it makes a lot of sense to also extend the documentation for the sort shuffle to include a tuning guide. Best, Yingjie Till Rohrmann wrote on Tue, Dec 14, 2021 at 18:57: > As part of this FLIP, does it make sense to also extend the documentation > for the sort sh

Re: [DISCUSS] Change some default config values of blocking shuffle

2021-12-14 Thread Yun Gao
Hi, > I think setting taskmanager.network.sort-shuffle.min-parallelism to 1 and > using sort-shuffle for all cases by default is a good suggestion. I am not > choosing this value mainly for two reasons: > 1. The first one is that it increases the usage of network memory which may > cause

Re: [DISCUSS] Change some default config values of blocking shuffle

2021-12-14 Thread Till Rohrmann
As part of this FLIP, does it make sense to also extend the documentation for the sort shuffle [1] to include a tuning guide? I am thinking of a more in-depth description of what things you might observe and how to influence them with the configuration options. [1] https://nightlies.apache.org/fli

Re: [DISCUSS] Change some default config values of blocking shuffle

2021-12-13 Thread Jingsong Li
Hi Yingjie, Thanks for your explanation. I have no more questions. +1 On Tue, Dec 14, 2021 at 3:31 PM Yingjie Cao wrote: > > Hi Jingsong, > > Thanks for your feedback. > > >>> My question is, what is the maximum parallelism a job can have with the > >>> default configuration? (Does this break o

Re: [DISCUSS] Change some default config values of blocking shuffle

2021-12-13 Thread Yingjie Cao
Hi Jingsong, Thanks for your feedback. >>> My question is, what is the maximum parallelism a job can have with the default configuration? (Does this break out of the box) Yes, you are right, these two options are related to network memory and framework off-heap memory. Generally, these changes w

Re: [DISCUSS] Change some default config values of blocking shuffle

2021-12-13 Thread Jingsong Li
Hi Yingjie, +1 for this FLIP. I'm pretty sure this will greatly improve the ease of use of batch jobs. Looks like "taskmanager.memory.framework.off-heap.batch-shuffle.size" and "taskmanager.network.sort-shuffle.min-buffers" are related to network memory and framework.off-heap.size. My question is, wha

Re: [DISCUSS] Change some default config values of blocking shuffle

2021-12-13 Thread Yingjie Cao
Hi Jiangang, Thanks for your suggestion. >>> The config can affect the memory usage. Will the related memory configs be changed? I think we will not change the default network memory settings. My best expectation is that the default value can work for most cases (though it may not be the best) and for

Re: [DISCUSS] Change some default config values of blocking shuffle

2021-12-13 Thread Yingjie Cao
Hi Yun, Thanks for your feedback. I think setting taskmanager.network.sort-shuffle.min-parallelism to 1 and using sort-shuffle for all cases by default is a good suggestion. I am not choosing this value mainly for two reasons: 1. The first one is that it increases the usage of network memory

Re: [DISCUSS] Change some default config values of blocking shuffle

2021-12-10 Thread 刘建刚
Glad to see the suggestion. In our tests, we found that the changed configs do not improve performance much for small jobs, just as in your test. I have some suggestions: - The config can affect the memory usage. Will the related memory configs be changed? - Can you share the TPC-DS resu

Re: [DISCUSS] Change some default config values of blocking shuffle

2021-12-10 Thread Yun Gao
Hi Yingjie, Many thanks for drafting the FLIP and initiating the discussion! May I double-check on taskmanager.network.sort-shuffle.min-parallelism: since other frameworks like Spark use sort-based shuffle for all cases, does our current circumstance still have dif

Re: [DISCUSS] Change some default config values of blocking shuffle

2021-12-10 Thread Yingjie Cao
Hi dev & users: I have created a FLIP [1] for it, feedback is highly appreciated. Best, Yingjie [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-199%3A+Change+some+default+config+values+of+blocking+shuffle+for+better+usability Yingjie Cao wrote on Fri, Dec 3, 2021 at 17:02: > Hi dev & users, >

Re: [DISCUSS] Change some default config values of blocking shuffle

2021-12-06 Thread Yingjie Cao
Hi Till, Thanks for your feedback. >>> How will our tests be affected by these changes? Will Flink require more resources and, thus, will it risk destabilizing our testing infrastructure? There are some tests that need to be adjusted, for example, BlockingShuffleITCase. For other tests, theoreti

Re: [DISCUSS] Change some default config values of blocking shuffle

2021-12-03 Thread Till Rohrmann
Thanks for starting this discussion Yingjie, How will our tests be affected by these changes? Will Flink require more resources and, thus, will it risk destabilizing our testing infrastructure? I would propose to create a FLIP for these changes since you propose to change the default behaviour. I