I think this may also be due to my having multiple copies of Python: my driver program was using Python 3.4.2, while my local slave nodes were using Python 3.4.4 (the system administrator's version).
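One way to avoid such a driver/worker mismatch is to point both sides of PySpark at the same interpreter before the SparkContext is created. This is a minimal sketch, assuming the binary at `sys.executable` is the Python you want everywhere; substitute your own path as needed:

```python
import os
import sys

# PySpark launches worker processes with the interpreter named by
# PYSPARK_PYTHON, and the driver with PYSPARK_DRIVER_PYTHON. Pointing
# both at the same binary keeps driver and workers in sync. These must
# be set before the SparkContext is created.
os.environ["PYSPARK_PYTHON"] = sys.executable
os.environ["PYSPARK_DRIVER_PYTHON"] = sys.executable
```

Setting them in the shell (`export PYSPARK_PYTHON=...`) before launching the job works equally well.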
On Fri, Feb 12, 2016 at 5:51 PM, Zheng Wendell <zhengwend...@gmail.com> wrote:
> Sorry, I can no longer reproduce the error.
> After upgrading Python 3.4.2 to Python 3.4.4, the error disappears.
>
> Spark release: spark-1.6.0-bin-hadoop2.6
> Code snippet:
> ```
> lines = sc.parallelize([5,6,2,8,5,2,4,9,2,1,7,3,4,1,5,8,7,6])
> pairs = lines.map(lambda x: (x, 1))
> counts = pairs.reduceByKey(lambda a, b: a + b)
> counts.collect()
> ```
>
> On Fri, Feb 12, 2016 at 4:26 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Can you give a bit more information?
>>
>> - release of Spark you use
>> - full error trace
>> - your code snippet
>>
>> Thanks
>>
>> On Fri, Feb 12, 2016 at 7:22 AM, Sisyphuss <zhengwend...@gmail.com> wrote:
>>
>>> When trying the `reduceByKey` transformation on Python 3.4, I got the
>>> following error:
>>>
>>> ImportError: No module named 'UserString'
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Python3-does-not-have-Module-UserString-tp26212.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
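For reference, the quoted snippet is a plain count-by-key: `map` turns each element into a `(x, 1)` pair and `reduceByKey` sums the 1s per key. The same result can be checked without Spark, using only the standard library:

```python
from collections import Counter

# Same data as the quoted sc.parallelize(...) call.
data = [5, 6, 2, 8, 5, 2, 4, 9, 2, 1, 7, 3, 4, 1, 5, 8, 7, 6]

# Equivalent of map(lambda x: (x, 1)) followed by
# reduceByKey(lambda a, b: a + b): count occurrences per key.
counts = dict(Counter(data))

# counts.collect() in the snippet returns these (key, total) pairs
# (Spark does not guarantee any particular ordering).
print(sorted(counts.items()))
```

This makes it easy to confirm the expected output of `counts.collect()` independently of the cluster's Python setup.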