Re: Re: repartitionAndSortWithinPartitions task shuffle phase is very slow

2015-10-30 Thread Luke Han
We would love to have any suggestions or comments about our implementation. Does anyone have experience with this? Thanks. Best Regards! - Luke Han On Tue, Oct 27, 2015 at 10:33 AM, 周千昊 wrote: > I have replaced the default Java serialization with Kryo. > It indeed reduces the sh

Re: Re: repartitionAndSortWithinPartitions task shuffle phase is very slow

2015-10-26 Thread 周千昊
I have replaced the default Java serialization with Kryo. It does reduce the shuffle size, and performance has improved; however, the shuffle speed remains unchanged. I am quite new to Spark. Does anyone have an idea of which direction I should take to find the root cause? 周千昊 于2015年1

Re: Re: repartitionAndSortWithinPartitions task shuffle phase is very slow

2015-10-23 Thread 周千昊
We have not tried that yet; however, both the MR and Spark implementations were tested with the same number of partitions on the same cluster. 250635...@qq.com <250635...@qq.com> wrote on Fri, Oct 23, 2015 at 5:21 PM: > Hi, > > Not an expert on this kind of implementation. But referring to the > performance result, > > i