Re: How to perform reduce operation in the same order as partition indexes

Pulasthi Supun Wickramasinghe Thu, 19 May 2016 15:53:58 -0700

Hi Ayan,

Thanks for the reply. Yes that is what i am currently doing. I thought
there may be a more efficient way provided by spark that i could use
directly.


Best Regards,
Pulasthi

On Thu, May 19, 2016 at 6:42 PM, ayan guha <[email protected]> wrote:

> You can add the index from mappartitionwithindex in the output and order
> based on that in merge step
> On 19 May 2016 13:22, "Pulasthi Supun Wickramasinghe" <
> [email protected]> wrote:
>
>> Hi Devs/All,
>>
>> I am pretty new to Spark. I have a program which does some map reduce
>> operations with matrices. Here *shortrddFinal* is a of type "
>> *RDD[Array[Short]]"* and consists of several partitions
>>
>> *var BC =
>> shortrddFinal.mapPartitionsWithIndex(calculateBCInternal).reduce(mergeBC)*
>>
>> The map function produces a "Array[Array[Double]]" and at the reduce step
>> i need to merge all the 2 dimensional double arrays produced for each
>> partition into one big matrix. But i also need to keep the same order as
>> the partitions. that is the 2D double array produced for partition 0 should
>> be the first set of rows in the matrix and then the 2d double array
>> produced for partition 1 and so on. Is there a way to enforce the order in
>> the reduce step.
>>
>> Thanks in advance
>>
>> Best Regards,
>> Pulasthi
>> --
>> Pulasthi S. Wickramasinghe
>> Graduate Student  | Research Assistant
>> School of Informatics and Computing | Digital Science Center
>> Indiana University, Bloomington
>> cell: 224-386-9035
>>
>


-- 
Pulasthi S. Wickramasinghe
Graduate Student  | Research Assistant
School of Informatics and Computing | Digital Science Center
Indiana University, Bloomington
cell: 224-386-9035

Re: How to perform reduce operation in the same order as partition indexes

Reply via email to