I used Spark 1.1.

On Wed, Jan 14, 2015 at 2:24 PM, Aaron Davidson <ilike...@gmail.com> wrote:
> What version are you running? I think "spark.shuffle.use.netty" was a
> valid option only in Spark 1.1, where the Netty stuff was strictly
> experimental. Spark 1.2 contains an officially supported and much more
> thoroughly tested version under the property
> "spark.shuffle.blockTransferService", which is set to netty by default.
>
> On Tue, Jan 13, 2015 at 9:26 PM, lihu <lihu...@gmail.com> wrote:
>
>> Hi,
>>     I just tested the groupByKey method on 100GB of data. The cluster is
>> 20 machines, each with 125GB RAM.
>>
>>     At first I set conf.set("spark.shuffle.use.netty", "false") and ran
>> the experiment; then I set conf.set("spark.shuffle.use.netty", "true")
>> and re-ran it, but in the latter case the GC time was much higher.
>>
>> I thought the latter should be better, but it is not. So when should we
>> use netty for network shuffle fetching?

--
*Best Wishes!*

*Li Hu(李浒) | Graduate Student*
*Institute for Interdisciplinary Information Sciences (IIIS
<http://iiis.tsinghua.edu.cn/>)*
*Tsinghua University, China*

*Email: lihu...@gmail.com <lihu...@gmail.com>*
*Homepage: http://iiis.tsinghua.edu.cn/zh/lihu/
<http://iiis.tsinghua.edu.cn/zh/lihu/>*
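For readers following along, the two properties discussed above are set the same way on a SparkConf. This is a minimal sketch, not a runnable job: the app name is a placeholder, and the property names and the Spark 1.2 default of "netty" are taken from the thread itself.

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("shuffle-config-example") // placeholder name

// Spark 1.1 only: Netty-based shuffle fetching was experimental and opt-in.
conf.set("spark.shuffle.use.netty", "true")

// Spark 1.2+: the supported property. "netty" is already the default,
// so setting it explicitly is only needed to switch back to "nio".
conf.set("spark.shuffle.blockTransferService", "netty")
```

Note that in Spark 1.2 the old "spark.shuffle.use.netty" flag no longer has any effect, which may explain unexpected results when carrying 1.1-era settings forward.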