Re: Aggregate subsequenty x row values together.

2016-03-28 Thread sujeet jog
Hi Ted,

There is no row key per se, and I actually do not want to sort. I want to aggregate each subsequent x rows together as a mean value, maintaining the order of the row entries.

For example, input RDD:

[ 12, 45 ]
[ 14, 50 ]
[ 10, 35 ]
[ 11, 50 ]

expected output rdd , the below is actually a aggreg
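The requested behaviour (the mean of every x consecutive rows, with row order kept) can be pinned down with a small plain-Python reference before moving to Spark. This is only a sketch of what the description implies: the poster's own expected output is cut off in the archive, `chunked_means` is a hypothetical helper name, and x = 2 is taken from the original question.

```python
def chunked_means(rows, x):
    """Column-wise mean of every x consecutive rows, in input order.

    A trailing chunk shorter than x is averaged over the rows it has.
    """
    out = []
    for start in range(0, len(rows), x):
        chunk = rows[start:start + x]
        # zip(*chunk) walks the chunk column by column.
        out.append([sum(col) / float(len(chunk)) for col in zip(*chunk)])
    return out


rows = [[12, 45], [14, 50], [10, 35], [11, 50]]
print(chunked_means(rows, 2))  # [[13.0, 47.5], [10.5, 42.5]]
```

A reference like this is handy for checking a distributed version on a small sample.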

Re: Aggregate subsequenty x row values together.

2016-03-28 Thread Ted Yu
Can you describe your use case a bit more?

Since the row keys are not sorted in your example, there is a chance that you get nondeterministic results when you aggregate on groups of two successive rows.

Thanks

On Mon, Mar 28, 2016 at 9:21 AM, sujeet jog wrote:
> Hi,
>
> I have a RDD like this

Re: Aggregate subsequenty x row values together.

2016-03-28 Thread Alexander Krasnukhin
So, why not make a fake key and aggregate on it?

On Mon, Mar 28, 2016 at 6:21 PM, sujeet jog wrote:
> Hi,
>
> I have a RDD like this .
>
> [ 12, 45 ]
> [ 14, 50 ]
> [ 10, 35 ]
> [ 11, 50 ]
>
> i want to aggregate values of first two rows into 1 row and subsequently the
> next two rows into anothe
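One way the fake-key idea could look in PySpark is sketched below. It is an illustration, not the method anyone in the thread posted: it assumes the RDD's current order is the order that matters, uses `zipWithIndex` to number the rows, and derives the fake key as `index // x`. All helper names are hypothetical, and `sc` in the usage comment stands for an existing SparkContext.

```python
def add_group_key(row_with_index, x):
    """Map (row, index) from zipWithIndex to (index // x, (row, 1))."""
    row, idx = row_with_index
    return (idx // x, (list(row), 1))


def merge(a, b):
    """Combine two (column_sums, row_count) pairs element-wise."""
    return ([p + q for p, q in zip(a[0], b[0])], a[1] + b[1])


def to_mean(keyed):
    """Turn (key, (column_sums, row_count)) into a row of column means."""
    _, (sums, count) = keyed
    return [s / float(count) for s in sums]


def mean_of_every_x_rows(rdd, x):
    """Average every x consecutive rows of rdd, keeping group order."""
    return (rdd.zipWithIndex()                        # (row, 0), (row, 1), ...
               .map(lambda pair: add_group_key(pair, x))
               .reduceByKey(merge)                    # sum columns per fake key
               .sortByKey()                           # restore group order
               .map(to_mean))


# Usage, assuming a SparkContext named sc is available:
# rdd = sc.parallelize([[12, 45], [14, 50], [10, 35], [11, 50]])
# mean_of_every_x_rows(rdd, 2).collect()
```

`zipWithIndex` orders rows by partition index and then by position within each partition, so the grouping is only as stable as the RDD's own ordering. The `sortByKey` step is there because `reduceByKey` shuffles the data and does not preserve order.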

Aggregate subsequenty x row values together.

2016-03-28 Thread sujeet jog
Hi,

I have an RDD like this:

[ 12, 45 ]
[ 14, 50 ]
[ 10, 35 ]
[ 11, 50 ]

I want to aggregate the values of the first two rows into one row, and subsequently the next two rows into another single row.

I don't have a key to aggregate on for use with the PySpark aggregate functions. How can I achieve this?