Re: use netty shuffle for network cause high gc time

2015-01-14 Thread lihu
I used Spark 1.1.
On Wed, Jan 14, 2015 at 2:24 PM, Aaron Davidson wrote:
> What version are you running? I think "spark.shuffle.use.netty" was a
> valid option only in Spark 1.1, where the Netty stuff was strictly
> experimental. Spark 1.2 contains an officially supported and much more
> thoro

Re: use netty shuffle for network cause high gc time

2015-01-13 Thread Aaron Davidson
What version are you running? I think "spark.shuffle.use.netty" was a valid option only in Spark 1.1, where the Netty stuff was strictly experimental. Spark 1.2 contains an officially supported and much more thoroughly tested version under the property "spark.shuffle.blockTransferService", which is
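A minimal sketch of the two settings contrasted above, assuming spark-core is on the classpath (for example, pasted into spark-shell). The app name is hypothetical. The value "netty" is an assumption taken from the Spark 1.2 configuration documentation, which lists "netty" and "nio"; the message above is cut off before it states a value.

```scala
import org.apache.spark.SparkConf

// Spark 1.1: the experimental flag used in the original question.
val conf11 = new SparkConf()
  .setAppName("shuffle-transport-demo") // hypothetical app name
  .set("spark.shuffle.use.netty", "true")

// Spark 1.2: the supported property named in the reply.
// "netty" is an assumed value (from the 1.2 configuration docs);
// "nio" is the documented alternative.
val conf12 = new SparkConf()
  .setAppName("shuffle-transport-demo")
  .set("spark.shuffle.blockTransferService", "netty")
```

Setting the 1.1 flag on a 1.2 cluster would most likely have no effect, which is why the replies ask for the version first.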

Re: use netty shuffle for network cause high gc time

2015-01-13 Thread Andrew Ash
To confirm, lihu, are you using Spark version 1.2.0?
On Tue, Jan 13, 2015 at 9:26 PM, lihu wrote:
> Hi,
> I just tested the groupByKey method on 100GB of data; the cluster has 20
> machines, each with 125GB RAM.
>
> At first I set conf.set("spark.shuffle.use.netty", "false") and run
> the expe