Re: Spark DataFrame Reduce Job Took 40s for 6000 Rows

2015-06-15 Thread Todd Nist

k.apache.org"
> Date: 06/15/2015 03:02 PM
> Subject: Re: Spark DataFrame Reduce Job Took 40s for 6000 Rows
>
> Have a look here https://spark.apache.org/docs/latest/tuning.html
> <https://spark.apache.org/docs/latest/tunin

Re: Spark DataFrame Reduce Job Took 40s for 6000 Rows

2015-06-15 Thread Proust GZ Feng

Are there any additional ideas? Thanks a lot.
Proust
From: Akhil Das
To: Proust GZ Feng/China/IBM@IBMCN
Cc: "user@spark.apache.org"
Date: 06/15/2015 03:02 PM
Subject: Re: Spark DataFrame Reduce Job Took 40s for 6000 Rows
Have a look here https://spark.apach

Re: Spark DataFrame Reduce Job Took 40s for 6000 Rows

2015-06-15 Thread Akhil Das

Have a look here https://spark.apache.org/docs/latest/tuning.html
Thanks
Best Regards
On Mon, Jun 15, 2015 at 11:27 AM, Proust GZ Feng wrote:
> Hi, Spark Experts
>
> I have been experimenting with Spark for several weeks. After some testing, a reduce
> operation on a DataFrame took 40s on a cluster with 5 d

Spark DataFrame Reduce Job Took 40s for 6000 Rows

2015-06-14 Thread Proust GZ Feng

Hi, Spark Experts
I have been experimenting with Spark for several weeks. After some testing, a reduce operation on a DataFrame took 40s on a cluster with 5 datanode executors. The back end holds only about 6,000 rows; is this normal? Such performance looks very poor, because in Java a loop over 6,000 rows ca