Yes, I know about that; it is for cases where you want to reduce the number
of partitions. The point here is that the data is skewed toward a few
partitions.
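
To make the skew problem concrete, here is a minimal sketch of one common workaround, key salting: append a small random salt to each key so that records sharing a hot key spread over several partitions. It is plain Python rather than Spark code, and everything in it is an assumption for illustration: the toy data set, the names NUM_PARTITIONS, base_partition and salted_partition, and the CRC32-based stand-in for a hash partitioner.

```python
import random
import zlib
from collections import Counter

NUM_PARTITIONS = 8  # hypothetical partition count for the toy example


def base_partition(key, n=NUM_PARTITIONS):
    # Deterministic stand-in for a hash partitioner: same key -> same partition.
    return zlib.crc32(key.encode("utf-8")) % n


def salted_partition(key, salt, n=NUM_PARTITIONS):
    # Custom partitioner: the salt shifts the record away from the key's
    # base partition, so one hot key can occupy up to n partitions.
    return (base_partition(key, n) + salt) % n


def sizes(partition_ids, n=NUM_PARTITIONS):
    # Number of records that land in each partition.
    counts = Counter(partition_ids)
    return [counts.get(i, 0) for i in range(n)]


rng = random.Random(42)

# Toy skewed pair data: 90% of the records share the key "hot".
pairs = [("hot", i) for i in range(9000)]
pairs += [("key%d" % i, i) for i in range(1000)]

# Without salting, every "hot" record goes to the same partition.
before = sizes(base_partition(k) for k, _ in pairs)

# With salting, each record becomes ((key, salt), value).
salted = [((k, rng.randrange(NUM_PARTITIONS)), v) for k, v in pairs]
after = sizes(salted_partition(k, s) for (k, s), _ in salted)

print("largest partition before salting:", max(before))
print("largest partition after salting: ", max(after))
```

Note that once keys are salted, any per-key aggregation needs a second step that drops the salt and merges the partial results for each original key.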


On Sat, Oct 17, 2015 at 6:27 PM, Raghavendra Pandey <
raghavendra.pan...@gmail.com> wrote:

> You can use the coalesce function if you want to reduce the number of
> partitions. It minimizes the data shuffle.
>
> -Raghav
>
> On Sat, Oct 17, 2015 at 1:02 PM, shahid qadri <shahidashr...@icloud.com>
> wrote:
>
>> Hi folks
>>
>> I need to repartition a large data set (around 300 GB), as I see that some
>> partitions hold much more data than others (data skew).
>>
>> I have pair RDDs: [({},{}),({},{}),({},{})]
>>
>> What is the best way to solve the problem?
>>
>>
>


-- 
with Regards
Shahid Ashraf
