You can include the partition index that mapPartitionsWithIndex passes to your function in its output, and then order the blocks by that index in the merge step.

On 19 May 2016 13:22, "Pulasthi Supun Wickramasinghe" <[email protected]> wrote:
> Hi Devs/All,
>
> I am pretty new to Spark. I have a program which does some map reduce
> operations with matrices. Here *shortrddFinal* is of type
> *RDD[Array[Short]]* and consists of several partitions.
>
> *var BC = shortrddFinal.mapPartitionsWithIndex(calculateBCInternal).reduce(mergeBC)*
>
> The map function produces an "Array[Array[Double]]", and at the reduce step
> I need to merge all the 2D double arrays produced for each
> partition into one big matrix. But I also need to keep the same order as
> the partitions; that is, the 2D double array produced for partition 0 should
> be the first set of rows in the matrix, then the 2D double array
> produced for partition 1, and so on. Is there a way to enforce the order in
> the reduce step?
>
> Thanks in advance
>
> Best Regards,
> Pulasthi
> --
> Pulasthi S. Wickramasinghe
> Graduate Student | Research Assistant
> School of Informatics and Computing | Digital Science Center
> Indiana University, Bloomington
> cell: 224-386-9035
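
To make the suggestion concrete, here is a minimal sketch of tagging each partition's block with its index and restoring the order after the merge. It is plain Scala so it can be read without a cluster. The names OrderedMerge, computeBlock, tagPartition, mergeTagged and assemble are hypothetical, and computeBlock is only a stand-in for calculateBCInternal, whose real logic is not shown in the thread.

```scala
object OrderedMerge {
  type Block  = Array[Array[Double]]   // 2D array produced for one partition
  type Tagged = Vector[(Int, Block)]   // blocks paired with their partition index

  // Hypothetical stand-in for the real per-partition computation:
  // it simply converts each input row to doubles.
  def computeBlock(rows: Iterator[Array[Short]]): Block =
    rows.map(_.map(_.toDouble)).toArray

  // Has the (Int, Iterator[T]) => Iterator[U] shape that
  // mapPartitionsWithIndex expects; emits one tagged block per partition.
  def tagPartition(index: Int, rows: Iterator[Array[Short]]): Iterator[Tagged] =
    Iterator(Vector((index, computeBlock(rows))))

  // Reduce step: only gathers tagged blocks. The order in which pairs
  // arrive does not matter because assemble sorts by index afterwards.
  def mergeTagged(a: Tagged, b: Tagged): Tagged = a ++ b

  // Final step: sort by partition index, then stack the rows of each block.
  def assemble(tagged: Tagged): Block =
    tagged.sortBy(_._1).flatMap(_._2).toArray
}
```

Under these assumptions the Spark call would look roughly like shortrddFinal.mapPartitionsWithIndex(OrderedMerge.tagPartition).reduce(OrderedMerge.mergeTagged), followed by OrderedMerge.assemble on the result. As with the original reduce, every block still ends up on the driver, so this only helps with ordering, not with memory.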
