To: user@spark.apache.org
Subject: newbie question for reduce
Hello,
Please help me take a look at why this simple reduce doesn't work:
>>> rdd = sc.parallelize([("a",1),("b",2),("c",3)])
>>>
>>> rdd.reduce(lambda x,y: x[1]+y[1])
Traceback (most rec
The problem is that you are reducing a list of tuples, but your function
produces an int. The resulting int can't then be combined with another tuple
by your function. reduce() has to produce the same type as its arguments.
rdd.map(lambda x: x[1]).reduce(lambda x,y: x+y)
... would work
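To see the same rule without a Spark cluster, here is a sketch using plain
Python's functools.reduce, which has the same contract as RDD.reduce (the
variable names are illustrative, not from the original thread):

```python
from functools import reduce

pairs = [("a", 1), ("b", 2), ("c", 3)]

# First call: x = ("a", 1), y = ("b", 2) -> 1 + 2 = 3, an int.
# Second call: x = 3 (an int), y = ("c", 3) -> 3[1] raises TypeError,
# because the accumulator is no longer a tuple.
try:
    reduce(lambda x, y: x[1] + y[1], pairs)
except TypeError as e:
    print("fails:", e)

# The fix: map to the values first, so reduce only ever sees ints.
total = reduce(lambda x, y: x + y, map(lambda p: p[1], pairs))
print(total)  # 6
```

The map-then-reduce shape is exactly what the rdd.map(...).reduce(...) fix
above does on the cluster.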
On Tue, Jan 18, 2022
Hello,
Please help me take a look at why this simple reduce doesn't work:
rdd = sc.parallelize([("a",1),("b",2),("c",3)])
rdd.reduce(lambda x,y: x[1]+y[1])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/opt/spark/python/pyspark/rdd.py", line 1001, in reduce
return reduce(f