Assuming you run Spark locally (i.e. either local mode or a standalone cluster on your local machine):

1. You need to have the Hadoop binaries locally.
2. You need to have hdfs-site.xml on the Spark classpath of your local machine.

A minimal sketch of this setup is below.
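Something like the following PySpark snippet, where the config directory, namenode host, port, and file path are all placeholders you would swap for your cluster's values. The hdfs://... address is simply the namenode URI: look it up in the fs.defaultFS property of the cluster's core-site.xml (CDH typically uses port 8020).

    import os

    # Assumption: core-site.xml and hdfs-site.xml have been copied from the
    # cluster into this local directory. This must be set before the
    # SparkContext is created, since Spark's JVM picks it up at startup.
    os.environ["HADOOP_CONF_DIR"] = "/path/to/cluster-conf"

    from pyspark import SparkContext

    sc = SparkContext("local[4]", "hdfs-connect-test")

    # The hdfs:// URI is the namenode address from fs.defaultFS;
    # host, port and path here are placeholders.
    lines = sc.textFile("hdfs://namenode-host:8020/user/elina/sample.txt")
    print(lines.count())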
I would suggest you start off with local files to play around with. If you need to run Spark on the CDH cluster using YARN, then you need to use spark-submit against the YARN cluster; a minimal invocation is sketched at the end of this message, after the quoted question. You can see a very good example here:
https://spark.apache.org/docs/latest/running-on-yarn.html

On Wed, Jul 15, 2015 at 10:36 PM, Jeskanen, Elina <elina.jeska...@cgi.com> wrote:

> I have Spark 1.4 on my local machine and I would like to connect to our
> local 4-node Cloudera cluster. But how?
>
> In the example it says text_file = spark.textFile("hdfs://..."), but can
> you advise me on where to get this "hdfs://..." address?
>
> Thanks!
>
> Elina

--
Best Regards,
Ayan Guha
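P.S. The spark-submit invocation I mean would look roughly like this; treat it as a sketch, since the script name and resource settings are assumptions you would adapt to your cluster (the linked page above documents the full option list):

    spark-submit \
      --master yarn-cluster \
      --num-executors 4 \
      --executor-memory 2g \
      your_script.py

Note that HADOOP_CONF_DIR (or YARN_CONF_DIR) must point at the cluster's config directory here as well, so spark-submit knows where the ResourceManager and namenode are.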