This is most likely because you are running different versions of Python on the driver and the slaves. Spark 1.4, which will be released soon, will double-check for that.
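A minimal sketch to confirm the mismatch, assuming an already-created SparkContext `sc` (for example from pyspark-shell); it compares the driver's Python version with what the executors actually run:

import sys

driver_version = sys.version

def worker_version(_):
    import sys
    return sys.version

# Run a tiny job so the version check executes on the worker nodes.
worker_versions = set(sc.parallelize(range(8), 8).map(worker_version).collect())

print("driver : %s" % driver_version)
print("workers: %s" % worker_versions)
# If these differ (e.g. system Python vs. Anaconda Python), pickled code
# objects shipped from the driver can fail on the workers with errors such
# as "SystemError: unknown opcode".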
SPARK_PYTHON should be PYSPARK_PYTHON; see the sketch below the quoted message.

On Tue, May 26, 2015 at 11:21 AM, Nikhil Muralidhar <nmural...@gmail.com> wrote:
> Hello,
> I am trying to run a Spark job (which runs fine on the master node of the
> cluster) on an HDFS Hadoop cluster using YARN. When I run the job, which has
> an rdd.saveAsTextFile() line in it, I get the following error:
>
> SystemError: unknown opcode
>
> The entire stack trace has been appended to this message.
>
> All the nodes on the cluster, including the master, have Python 2.7.9
> running on them, and all of them have the variable SPARK_PYTHON set to the
> Anaconda Python path. When I try pyspark-shell on these instances, they use
> Anaconda Python to open the Spark shell.
>
> I installed Anaconda on all slaves after looking at the Python version
> incompatibility issues mentioned in the following post:
>
> http://glennklockwood.blogspot.com/2014/06/spark-on-supercomputers-few-notes.html
>
> Please let me know what the issue might be.
>
> The Spark version we are using is Spark 1.3.
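Here is a minimal sketch of a standalone script with the variable set under that name. The Anaconda path (/opt/anaconda/bin/python) and the HDFS output path are hypothetical, and it assumes PySpark reads PYSPARK_PYTHON from the driver environment when the SparkContext is created; exporting the variable in conf/spark-env.sh on every node (or in the shell that runs spark-submit) works as well:

import os
# Hypothetical path; the same interpreter must exist on every node.
os.environ["PYSPARK_PYTHON"] = "/opt/anaconda/bin/python"

from pyspark import SparkConf, SparkContext

conf = SparkConf().setAppName("save-as-text-file-check")
sc = SparkContext(conf=conf)

# The action reported to fail when driver and worker Pythons differ.
rdd = sc.parallelize(["a", "b", "c"])
rdd.saveAsTextFile("hdfs:///tmp/save_as_text_file_check")  # hypothetical output path

sc.stop()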