Hi,
The reduce lambda receives the return value of the previous call as its
first argument. The first time, it is invoked with:
x = ("a", 1), y = ("b", 2)
and returns 1 + 2 = 3.
The second time, it is invoked with:
x = 3, y = ("c", 3)
so the lambda now tries to treat the int 3 as a tuple, which raises the
error that you are seeing.
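To make the call sequence concrete, here is a minimal local sketch that uses Python's functools.reduce as a stand-in for the RDD. The sample data and the name add_values are assumptions for illustration, not your actual code. Note that functools.reduce always folds left to right, whereas Spark reduces within each partition and then merges the partial results, so the exact order of calls on a cluster may differ; the type mismatch is the same either way.

```python
from functools import reduce

# Hypothetical sample data standing in for the contents of the RDD.
pairs = [("a", 1), ("b", 2), ("c", 3)]

def add_values(x, y):
    # Assumes both arguments are (key, value) tuples.
    return x[1] + y[1]

error = None
try:
    # 1st call: x = ("a", 1), y = ("b", 2) returns the int 3
    # 2nd call: x = 3,        y = ("c", 3) fails on x[1]
    reduce(add_values, pairs)
except TypeError as exc:
    error = exc

print(type(error).__name__, error)
```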
The problem is that you are reducing a list of tuples, but your function
produces an int. The resulting int can't be combined with the remaining
tuples by the same function. The function passed to reduce() has to return
the same type as its arguments.
rdd.map(lambda x: x[1]).reduce(lambda x, y: x + y)
... would work
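As a rough local check of the same idea, the sketch below again uses functools.reduce in place of the RDD, with made-up sample data. It shows the map-then-reduce version, plus one possible alternative where the reducer returns a tuple so that its output has the same shape as its inputs. The "total" key is a placeholder chosen for this example only.

```python
from functools import reduce

# Hypothetical sample data standing in for the contents of the RDD.
pairs = [("a", 1), ("b", 2), ("c", 3)]

# Option 1: extract the ints first (the map step), then reduce ints to an int.
values = [pair[1] for pair in pairs]
total = reduce(lambda x, y: x + y, values)

# Option 2: keep reducing tuples, but return a tuple as well, so the result
# of one call is a valid argument for the next call.
total_pair = reduce(lambda x, y: ("total", x[1] + y[1]), pairs)

print(total, total_pair)
```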
On Tue, Jan 18, 2022