Re: run reduceByKey on huge data in spark

2015-06-30 Thread barge.nilesh
"I 'm using 50 servers , 35 executors per server, 140GB memory per server" 35 executors *per server* sounds kind of odd to me. With 35 executors per server and server having 140gb, meaning each executor is going to get only 4gb, 4gb will be divided in to shuffle/storage memory fractions... assumi

Re: run reduceByKey on huge data in spark

2015-06-30 Thread lisendong
hello, I'm using spark 1.4.2-SNAPSHOT and I'm running in yarn mode :-) I wonder whether spark.shuffle.memoryFraction or spark.shuffle.manager take effect, and how to set these parameters... > On Jul 1, 2015, at 1:32 AM, Ted Yu wrote: > > Which Spark release are you using? > > Are you running in standalone mode? > > Ch
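Both keys are real Spark 1.x settings (they predate the unified memory manager introduced in 1.6), and they have to be set before the SparkContext is created. A minimal sketch; the values chosen are assumptions for illustration:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("shuffle-tuning-sketch")
      // "sort" has been the default shuffle manager since Spark 1.2; "hash" is the alternative.
      .set("spark.shuffle.manager", "sort")
      // Default is 0.2; raising it gives shuffle-time aggregation more memory (value assumed).
      .set("spark.shuffle.memoryFraction", "0.4")
    val sc = new SparkContext(conf) // settings take effect only if applied before this line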

Re: run reduceByKey on huge data in spark

2015-06-30 Thread Ted Yu
Which Spark release are you using? Are you running in standalone mode? Cheers On Tue, Jun 30, 2015 at 10:03 AM, hotdog wrote: > I'm running reduceByKey in spark. My program is the simplest example of > spark: > > val counts = textFile.flatMap(line => line.split(" ")).repartition(2). >
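The preview cuts the quoted program off; a self-contained sketch of the word count it describes follows. The input/output paths are hypothetical, and everything after repartition(2) is reconstructed from the description rather than quoted from the mail:

    import org.apache.spark.{SparkConf, SparkContext}

    object WordCount {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("word-count"))
        val textFile = sc.textFile("hdfs:///path/to/input") // hypothetical path
        val counts = textFile
          .flatMap(line => line.split(" "))
          .repartition(2)                // partition count as shown in the truncated preview
          .map(word => (word, 1))
          .reduceByKey(_ + _)            // aggregates per key, with map-side combining
        counts.saveAsTextFile("hdfs:///path/to/output")     // hypothetical path
        sc.stop()
      }
    }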