Re: Spark Dataframe: Save to hdfs is taking long time

2016-12-28 Thread Raju Bairishetti
Try setting the number of partitions to (number of executors * number of cores per executor) when writing to the destination location. Be very careful when choosing the number of partitions, as an incorrect number may lead to a shuffle. On Fri, Dec 16, 2016 at 12:56 PM, KhajaAsmath Mohammed < mdkhajaasm...@gmail.com> wrote: > I
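
The suggestion above could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper names (target_partitions, resize_partitions, save_as_parquet) and the overwrite mode are assumptions, and df is assumed to be a PySpark DataFrame. The sketch uses coalesce when reducing the partition count, since that merges partitions without a full shuffle, and repartition when increasing it, which does shuffle.

```python
def target_partitions(num_executors, cores_per_executor):
    """Rule of thumb from the message above: one partition per available core."""
    return max(1, num_executors * cores_per_executor)


def resize_partitions(df, target):
    """Return df with `target` partitions, avoiding a full shuffle when shrinking."""
    current = df.rdd.getNumPartitions()
    if current > target:
        # coalesce merges existing partitions without a full shuffle
        return df.coalesce(target)
    if current < target:
        # repartition redistributes all rows, which is a full shuffle
        return df.repartition(target)
    return df


def save_as_parquet(df, path, num_executors, cores_per_executor):
    """Hypothetical helper: resize partitions, then write Parquet to `path`."""
    target = target_partitions(num_executors, cores_per_executor)
    # "overwrite" is an assumption; pick the save mode that fits the job
    resize_partitions(df, target).write.mode("overwrite").parquet(path)
    return target
```

For example, with 10 executors of 4 cores each, save_as_parquet(df, output_path, 10, 4) would write 40 part files, where output_path stands for whatever HDFS location the job uses.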

Re: Spark Dataframe: Save to hdfs is taking long time

2016-12-15 Thread KhajaAsmath Mohammed
I am trying to save the files as Parquet. On Thu, Dec 15, 2016 at 10:41 PM, Felix Cheung wrote: > What is the format? > > > -- > *From:* KhajaAsmath Mohammed > *Sent:* Thursday, December 15, 2016 7:54:27 PM > *To:* user @spark > *Subject:* Spark Dataframe: Save to h

Re: Spark Dataframe: Save to hdfs is taking long time

2016-12-15 Thread Felix Cheung
What is the format? From: KhajaAsmath Mohammed Sent: Thursday, December 15, 2016 7:54:27 PM To: user @spark Subject: Spark Dataframe: Save to hdfs is taking long time Hi, I am facing an issue while saving the DataFrame back to HDFS. It's taking a long time to run.