subject:"Re\: Persist Dataframe to HDFS considering HDFS Block Size."

Re: Persist Dataframe to HDFS considering HDFS Block Size.

2019-01-21 Thread Shivam Sharma

>>> … at 12:47 AM Hichame El Khalfi wrote:
>>>
>>>> You can do this in 2 passes (not one)
>>>> A) save your dataset into hdfs with what you have.
>>>> B) calculate number of partitions, n = (size of your dataset) / hdfs block siz…

Re: Persist Dataframe to HDFS considering HDFS Block Size.

2019-01-21 Thread Arnaud LARROQUE

>>> …rtition, n = (size of your dataset) / hdfs block size
>>> Then run simple spark job to read and partition based on 'n'.
>>>
>>> Hichame
>>>
>>> *From:* felixcheun...@hotmail.com
>>> *Sent:* January 19, 2019 2:06 PM
>>> *T…

Re: Persist Dataframe to HDFS considering HDFS Block Size.

2019-01-21 Thread Shivam Sharma

>> …e spark job to read and partition based on 'n'.
>>
>> Hichame
>>
>> *From:* felixcheun...@hotmail.com
>> *Sent:* January 19, 2019 2:06 PM
>> *To:* 28shivamsha...@gmail.com; user@spark.apache.org
>> *Subject:* Re: Persist Dataframe to HDFS considering H…

Re: Persist Dataframe to HDFS considering HDFS Block Size.

2019-01-19 Thread Hichame El Khalfi

…19 2:06 PM
To: 28shivamsha...@gmail.com; user@spark.apache.org
Subject: Re: Persist Dataframe to HDFS considering HDFS Block Size.

You can call coalesce to combine partitions.

From: Shivam Sharma <28shivamsha...@gmail.com>
Sent: Saturday, January 19, 2019…

Re: Persist Dataframe to HDFS considering HDFS Block Size.

2019-01-19 Thread Felix Cheung

You can call coalesce to combine partitions.

From: Shivam Sharma <28shivamsha...@gmail.com>
Sent: Saturday, January 19, 2019 7:43 AM
To: user@spark.apache.org
Subject: Persist Dataframe to HDFS considering HDFS Block Size.

Hi All, I wanted to persist dataframe…
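The two suggestions in this thread (compute n = dataset size / HDFS block size, then coalesce or repartition to n) can be sketched roughly as below. This is only an illustration, not code from any of the posters: the helper names `target_partitions` and `resize_partitions` are hypothetical, and the 128 MiB default block size is an assumption that should be checked against `dfs.blocksize` on the actual cluster. The partition-count arithmetic is plain Python; `resize_partitions` expects a PySpark DataFrame and uses only `df.rdd.getNumPartitions()`, `df.coalesce(n)` and `df.repartition(n)`.

```python
# Assumption: 128 MiB block size; verify dfs.blocksize on your cluster.
DEFAULT_BLOCK_BYTES = 128 * 1024 * 1024


def target_partitions(dataset_bytes, block_bytes=DEFAULT_BLOCK_BYTES):
    """Hypothetical helper: n = ceil(dataset size / block size), at least 1."""
    if dataset_bytes < 0 or block_bytes <= 0:
        raise ValueError("dataset size must be >= 0 and block size > 0")
    # Integer ceiling division avoids float rounding on very large sizes.
    return max(1, (dataset_bytes + block_bytes - 1) // block_bytes)


def resize_partitions(df, n):
    """Hypothetical helper: return df with n partitions.

    df is expected to be a pyspark.sql.DataFrame. coalesce() merges
    partitions without a full shuffle, so it is used when shrinking;
    repartition() shuffles and is needed when growing.
    """
    current = df.rdd.getNumPartitions()
    if n < current:
        return df.coalesce(n)
    if n > current:
        return df.repartition(n)
    return df
```

In the second pass one would read back what the first pass wrote, take its on-disk size (for example from `hdfs dfs -du -s <path>`), and write `resize_partitions(df, target_partitions(size))` to the final location. Measuring the size on disk matters because compressed formats can be much smaller on HDFS than the DataFrame is in memory.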

5 matches
