Re: How to perform reduce operation in the same order as partition indexes

ayan guha Thu, 19 May 2016 15:43:00 -0700

You can add the partition index from mapPartitionsWithIndex to the output and
order by that index in the merge step.
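
A minimal sketch of that idea, under some assumptions: each partition yields
one block, and calculateBCInternal returns an Iterator[Array[Array[Double]]]
(which mapPartitionsWithIndex requires). The names OrderedMerge, tag,
mergeTagged and assemble are made up for illustration and are not from the
original program. The helpers are plain Scala so they can be tried without a
cluster; the Spark wiring is shown in the trailing comment.

```scala
object OrderedMerge {
  // A partial result: matrix blocks, each tagged with the index of the
  // partition that produced it.
  type Tagged = Vector[(Int, Array[Array[Double]])]

  // Wrap one partition's block together with its partition index.
  def tag(partitionIndex: Int, block: Array[Array[Double]]): Tagged =
    Vector((partitionIndex, block))

  // Merge function for reduce. Plain concatenation is enough: the arrival
  // order does not matter because assemble() sorts by index afterwards.
  def mergeTagged(a: Tagged, b: Tagged): Tagged = a ++ b

  // Sort the blocks by partition index and stack their rows into one matrix.
  def assemble(parts: Tagged): Array[Array[Double]] =
    parts.sortBy(_._1).flatMap { case (_, block) => block.toSeq }.toArray
}

// Possible wiring into the pipeline from the question (assumption: the
// signature of calculateBCInternal is as described in the lead-in):
//
//   val tagged = shortrddFinal
//     .mapPartitionsWithIndex((idx, it) =>
//       calculateBCInternal(idx, it).map(block => OrderedMerge.tag(idx, block)))
//     .reduce(OrderedMerge.mergeTagged)
//   val BC = OrderedMerge.assemble(tagged)
```

The sort happens once, after the reduce, so the merge function does not have
to care in which order Spark hands it the partial results.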
On 19 May 2016 13:22, "Pulasthi Supun Wickramasinghe" <[email protected]>
wrote:


> Hi Devs/All,
>
> I am pretty new to Spark. I have a program which does some map-reduce
> operations with matrices. Here *shortrddFinal* is of type
> *RDD[Array[Short]]* and consists of several partitions:
>
> *var BC =
> shortrddFinal.mapPartitionsWithIndex(calculateBCInternal).reduce(mergeBC)*
>
> The map function produces an "Array[Array[Double]]", and at the reduce step
> I need to merge all the two-dimensional double arrays produced for each
> partition into one big matrix. But I also need to keep the same order as
> the partitions. That is, the 2D double array produced for partition 0 should
> be the first set of rows in the matrix, followed by the 2D double array
> produced for partition 1, and so on. Is there a way to enforce the order in
> the reduce step?
>
> Thanks in advance
>
> Best Regards,
> Pulasthi
> --
> Pulasthi S. Wickramasinghe
> Graduate Student  | Research Assistant
> School of Informatics and Computing | Digital Science Center
> Indiana University, Bloomington
> cell: 224-386-9035
>
