PTAL:
http://stackoverflow.com/questions/29213404/how-to-split-an-rdd-into-multiple-smaller-rdds-given-a-max-number-of-rows-per
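
For context, here is a rough sketch of why coalesce() alone may not have helped. It does not use Spark at all: it only models partition sizes, and the class name, method names, and the record counts (9 partitions of 1000 records plus one of 81000, roughly the 90% skew described below) are made up for illustration. The assumption is that a coalesce without shuffle only merges whole partitions, while a shuffle-based repartition spreads the records themselves across the new partitions.

```java
import java.util.Arrays;

// Hypothetical model of partition sizes only; this is not Spark code.
class PartitionSkewSketch {

    // Models a coalesce without shuffle: whole input partitions are grouped
    // into fewer output partitions, so an oversized partition is never split.
    static long[] coalesceWithoutShuffle(long[] sizes, int numPartitions) {
        // Without a shuffle the partition count cannot grow.
        int n = Math.min(numPartitions, sizes.length);
        long[] out = new long[n];
        for (int i = 0; i < sizes.length; i++) {
            // Map neighbouring input partitions to the same output partition.
            out[(int) ((long) i * n / sizes.length)] += sizes[i];
        }
        return out;
    }

    // Models a shuffle-based repartition: the records of every input
    // partition are dealt round-robin across all output partitions.
    static long[] repartitionWithShuffle(long[] sizes, int numPartitions) {
        long[] out = new long[numPartitions];
        int next = 0; // output partition that receives the next record
        for (long size : sizes) {
            long base = size / numPartitions;
            long rem = size % numPartitions;
            for (int k = 0; k < numPartitions; k++) {
                int target = (next + k) % numPartitions;
                out[target] += base + (k < rem ? 1 : 0);
            }
            next = (int) ((next + rem) % numPartitions);
        }
        return out;
    }

    static long max(long[] sizes) {
        return Arrays.stream(sizes).max().getAsLong();
    }

    public static void main(String[] args) {
        // Made-up sizes: 9 small partitions and one holding 90% of the records.
        long[] skewed = {1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 81000};

        long[] merged = coalesceWithoutShuffle(skewed, 5);
        long[] spread = repartitionWithShuffle(skewed, 40);

        System.out.println("largest partition after coalesce:    " + max(merged));
        System.out.println("largest partition after repartition: " + max(spread));
    }
}
```

In this model the largest partition stays as big as before (or bigger) after the merge, while the round-robin spread evens it out. In Spark itself the corresponding thing to try would be calling repartition(n) on the JavaRDD before saveAsTextFile, with n chosen for the cluster; whether the extra shuffle pays off for this job is something to measure rather than assume.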

-Sahil

On Thu, Dec 3, 2015 at 9:18 AM, Ram VISWANADHA <
ram.viswana...@dailymotion.com> wrote:

> Yes. That did not help.
>
> Best Regards,
> Ram
> From: Ted Yu <yuzhih...@gmail.com>
> Date: Wednesday, December 2, 2015 at 3:25 PM
> To: Ram VISWANADHA <ram.viswana...@dailymotion.com>
> Cc: user <user@spark.apache.org>
> Subject: Re: Improve saveAsTextFile performance
>
> Have you tried calling coalesce() before saveAsTextFile?
>
> Cheers
>
> On Wed, Dec 2, 2015 at 3:15 PM, Ram VISWANADHA <
> ram.viswana...@dailymotion.com> wrote:
>
>> JavaRDD.saveAsTextFile is taking a long time to complete. There are 10
>> tasks; the first 9 finish in a reasonable time, but the last task takes
>> much longer. That last task holds the most records, about 90% of the
>> total. Is there any way to parallelize the execution by increasing the
>> number of tasks or by distributing the records evenly across the tasks?
>>
>> Thanks in advance.
>>
>> Best Regards,
>> Ram
>>
>
>
