Use repartition.
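Something along these lines (a rough Scala sketch, untested; the query and
the table names source_table / target_table are placeholders, not your
actual job):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .enableHiveSupport()
      .getOrCreate()

    // Keep the shuffle parallelism high for the transformations themselves.
    spark.conf.set("spark.sql.shuffle.partitions", "2000")

    // Placeholder for your actual Hive query / transformations.
    val result = spark.sql("SELECT * FROM source_table")

    // Collapse to a small number of partitions just before the write, so
    // each Hive partition gets a few larger files instead of 2000 x 10 MB.
    result
      .repartition(10)
      .write
      .mode("overwrite")
      .insertInto("target_table")

Note that repartition(10) adds one extra shuffle but leaves the upstream
stages at 2000 tasks; coalesce(10) would avoid that shuffle, but it can also
pull the upstream transformations down to 10 tasks, which would give back
the speedup you got from raising spark.sql.shuffle.partitions.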
On 13-Oct-2017 9:35 AM, "KhajaAsmath Mohammed" <mdkhajaasm...@gmail.com> wrote:

> Hi,
>
> I am reading a Hive query and writing the data back into Hive after doing
> some transformations.
>
> I changed the setting spark.sql.shuffle.partitions to 2000, and since then
> the job completes fast, but the main problem is that I am getting 2000
> files for each partition, each about 10 MB in size.
>
> Is there a way to get the same performance but write fewer files?
>
> I am trying repartition now, but would like to know if there are any
> other options.
>
> Thanks,
> Asmath