Hi, I'm trying to run an external script on Spark using rdd.pipe(). It runs
successfully in standalone mode, but on the cluster it fails with an error
coming from the executors: "Cannot run program "path/to/program": error=2,
No such file or directory".

Does the external script need to be available on all nodes in the cluster
when using rdd.pipe()?

What if I don't have permission to install anything on the nodes of the
cluster? Is there any other way to make the script available to the worker
nodes?

(The external script is stored in HDFS, and its path is passed to the driver
class through args.)
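For context, here is roughly what the driver code looks like. This is a
minimal sketch, not my actual code; the object name PipeJob, the sample data,
and the argument handling are just illustrative:

import org.apache.spark.sql.SparkSession

object PipeJob {
  def main(args: Array[String]): Unit = {
    // Illustrative: the script path (an HDFS URI) arrives as the first argument
    val scriptPath = args(0)

    val spark = SparkSession.builder().appName("PipeJob").getOrCreate()
    val sc = spark.sparkContext

    val records = sc.parallelize(Seq("a", "b", "c"))

    // pipe() makes each executor exec the command locally on its own node,
    // which is where the "No such file or directory" error is raised.
    val piped = records.pipe(scriptPath)
    piped.collect().foreach(println)
  }
}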


