Here is some Python code; I am sure you'll get the drift. Basically, you need to
implement two functions, seq and comb, which perform the partial (per-partition)
and final (cross-partition) aggregation respectively.
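def addtup(t1, t2):
    # Element-wise sum of two equal-length tuples.
    j = ()
    for k, v in enumerate(t1):
        j = j + (v + t2[k],)
    return j

def seq(tIntrm, tNext):
    # Partial step: fold the next value into the running total within a partition.
    return addtup(tIntrm, tNext)

def comb(tP, tF):
    # Final step: merge the partial totals coming from two partitions.
    return addtup(tP, tF)

lst = [(2553, (0,0,0,1,0,0,0,0)),
       (46551, (0,1,0,0,0,0,0,0)),
       (266, (0,1,0,0,0,0,0,0)),
       (2553, (0,0,0,0,0,1,0,0)),
       (225546, (0,0,0,0,0,1,0,0)),
       (225546, (0,0,0,0,0,1,0,0))]
# sc is the SparkContext, already defined in the pyspark shell.
base = sc.parallelize(lst)
# The first argument is the zero value that each key's accumulator starts from.
res = base.aggregateByKey((0,0,0,0,0,0,0,0), seq, comb)
for i in res.collect():
    print(i)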
Result:
(266, (0, 1, 0, 0, 0, 0, 0, 0))
(225546, (0, 0, 0, 0, 0, 2, 0, 0))
(2553, (0, 0, 0, 1, 0, 1, 0, 0))
(46551, (0, 1, 0, 0, 0, 0, 0, 0))
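
If it helps to see what seq and comb are each responsible for, here is a rough
plain-Python sketch of the two-step aggregation, with no Spark involved. The
aggregate_by_key helper and the two-way split of the list are made up for
illustration; Spark chooses the real partitioning itself, and this is only a
model of the idea, not how Spark is implemented.

```python
def addtup(t1, t2):
    # Element-wise sum of two equal-length tuples.
    return tuple(a + b for a, b in zip(t1, t2))

def aggregate_by_key(partitions, zero, seq, comb):
    # Hypothetical helper that mimics the two phases of aggregateByKey.
    # Phase 1: within each partition, fold every value into a per-key
    # accumulator that starts from the zero value (the job of seq).
    partials = []
    for part in partitions:
        acc = {}
        for key, value in part:
            acc[key] = seq(acc.get(key, zero), value)
        partials.append(acc)
    # Phase 2: merge the per-partition accumulators key by key
    # (the job of comb).
    merged = {}
    for acc in partials:
        for key, value in acc.items():
            merged[key] = comb(merged[key], value) if key in merged else value
    return merged

lst = [(2553, (0, 0, 0, 1, 0, 0, 0, 0)),
       (46551, (0, 1, 0, 0, 0, 0, 0, 0)),
       (266, (0, 1, 0, 0, 0, 0, 0, 0)),
       (2553, (0, 0, 0, 0, 0, 1, 0, 0)),
       (225546, (0, 0, 0, 0, 0, 1, 0, 0)),
       (225546, (0, 0, 0, 0, 0, 1, 0, 0))]

zero = (0,) * 8
# Assumed split into two partitions, purely for the example.
partitions = [lst[:3], lst[3:]]
result = aggregate_by_key(partitions, zero, addtup, addtup)
for key in sorted(result):
    print((key, result[key]))
```

Since seq and comb are the same function in this case, base.reduceByKey(addtup)
should also give the same totals on the RDD above, as tuple addition is
associative and commutative.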
On Thu, May 14, 2015 at 11:40 PM, Yasemin Kaya <[email protected]> wrote:
> Hi,
>
> I have a JavaPairRDD<String, String> and I want to use the reduceByKey
> method on it.
>
> My pairRDD :
> *2553: 0,0,0,1,0,0,0,0*
> 46551: 0,1,0,0,0,0,0,0
> 266: 0,1,0,0,0,0,0,0
> *2553: 0,0,0,0,0,1,0,0*
> *225546: 0,0,0,0,0,1,0,0*
> *225546: 0,0,0,0,0,1,0,0*
>
> I want to get :
> *2553: 0,0,0,1,0,1,0,0*
> 46551: 0,1,0,0,0,0,0,0
> 266: 0,1,0,0,0,0,0,0
> *225546: 0,0,0,0,0,2,0,0*
>
> Can anyone help me get that?
> Thank you.
>
> Have a nice day.
> yasemin
>
> --
> hiç ender hiç
>
--
Best Regards,
Ayan Guha