Spark: Using "node-local" files within functions?

2015-04-14 Thread Horsmann, Tobias
Hi, I am trying to use Spark in combination with YARN with 3rd party code which is unaware of distributed file systems. Providing HDFS file references thus does not work. My idea to resolve this issue was the following: Within a function I take the HDFS file reference I get as parameter and co

Which OS for Spark cluster nodes?

2015-04-03 Thread Horsmann, Tobias
Hi, are there any recommendations for operating systems that one should use for setting up Spark/Hadoop nodes in general? I am not familiar with the differences between the various Linux distributions or how well they are (not) suited for cluster set-ups, so I wondered if there is some preferred

Re: Spark throws rsync: change_dir errors on startup

2015-04-02 Thread Horsmann, Tobias
Try editing your sbin/spark-daemon.sh file, look for rsync inside the file, and add -v to that command to see what exactly is going wrong. Thanks Best Regards On Wed, Apr 1, 2015 at 7:25 PM, Horsmann, Tobias <tobias.horsm...@uni-due.de> wrote: Hi, I try to set up a minimal 2-node sp

Spark throws rsync: change_dir errors on startup

2015-04-01 Thread Horsmann, Tobias
Hi, I try to set up a minimal 2-node Spark cluster for testing purposes. When I start the cluster with start-all.sh I get an rsync error message: rsync: change_dir "/usr/local/spark130/sbin//right" failed: No such file or directory (2) rsync error: some files/attrs were not transferred (see prev