Hi all,
I am trying to run spark with the latest build (from branch-1.2), as far as
I can see, all the paths are set and SparkContext starts up OK, however, I
cannot run anything that goes to the nodes. I get the following error:
Error from python worker:
/usr/bin/python2.7: No module named pyspark
PYTHONPATH was:
/mnt/yarn/nm/usercache/massive/filecache/15/spark-assembly-1.2.0-SNAPSHOT-hadoop2.3.0.jar
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at
org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:163)
at
org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:86)
at
org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:62)
at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:102)
at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:70)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
any idea where it is picking up this path from?
thanks,
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/No-module-named-pyspark-latest-built-tp18740.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]