Compute n = (size of your dataset) / HDFS block size.
Then run a simple Spark job to read and repartition based on 'n'.

Hichame
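A minimal sketch of that approach in Scala follows; the input path, target
Hive table, and the 128 MB fallback block size are illustrative assumptions,
not details from this thread:

import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().enableHiveSupport().getOrCreate()

// Hypothetical location of the dataset on HDFS.
val inputPath = new Path("/data/input")
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)

// Total dataset size in bytes.
val datasetSize = fs.getContentSummary(inputPath).getLength

// HDFS block size; fall back to the common 128 MB default.
val blockSize = spark.sparkContext.hadoopConfiguration
  .getLongBytes("dfs.blocksize", 128L * 1024 * 1024)

// n = dataset size / block size, rounded up, at least 1.
val n = math.max(1, math.ceil(datasetSize.toDouble / blockSize).toInt)

// Repartition to n so each output file is roughly one HDFS block.
spark.read.parquet(inputPath.toString)
  .repartition(n)
  .write
  .mode("overwrite")
  .insertInto("target_hive_table")  // hypothetical table, must already exist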
*From:* felixcheun...@hotmail.com
*Sent:* January 19, 2019 2:06 PM
*To:* 28shivamsha...@gmail.com; user@spark.apache.org
*Subject:* Re: Persist Dataframe to HDFS considering HDFS Block Size.
You can call coalesce to combine partitions..
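For example (df, n, and the table name below are placeholders, not from the
thread):

// coalesce(n) merges existing partitions down to n without a full shuffle,
// unlike repartition(n), so it is a cheap way to reduce the output file count.
val compacted = df.coalesce(n)
compacted.write.mode("append").insertInto("target_hive_table")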
*From:* Shivam Sharma <28shivamsha...@gmail.com>
*Sent:* Saturday, January 19, 2019 7:43 AM
*To:* user@spark.apache.org
*Subject:* Persist Dataframe to HDFS considering HDFS Block Size.
Hi All,
I wanted to persist a dataframe on HDFS. Basically, I am inserting data into
a Hive table using Spark. Currently, at the time of writing to the Hive table
I have set total shuffle partitions = 400, so 400 files are being created,
which does not take the HDFS block size into account. How can I tell Spark to
size the output files according to the HDFS block size?
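For reference, a sketch of the write path being described, assuming
spark.sql.shuffle.partitions is the setting referred to above (the grouping
column and table name are hypothetical):

spark.conf.set("spark.sql.shuffle.partitions", "400")

// Any shuffle before the write now produces 400 tasks, so the insert
// below creates 400 output files regardless of the HDFS block size.
df.groupBy("some_key")
  .count()
  .write
  .insertInto("target_hive_table")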