Assuming you run Spark locally (i.e. either local mode or a standalone cluster on your local machine):

1. You need to have the Hadoop binaries locally.
2. You need to have hdfs-site.xml on the Spark classpath of your local machine.

A minimal sketch of this setup is below.
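Something like the following PySpark snippet, where the config directory, namenode host, port, and file path are all placeholders you would swap for your cluster's values. The hdfs://... address is simply the namenode URI: look it up in the fs.defaultFS property of the cluster's core-site.xml (CDH typically uses port 8020).

    import os

    # Assumption: core-site.xml and hdfs-site.xml have been copied from the
    # cluster into this local directory. This must be set before the
    # SparkContext is created, since Spark's JVM picks it up at startup.
    os.environ["HADOOP_CONF_DIR"] = "/path/to/cluster-conf"

    from pyspark import SparkContext

    sc = SparkContext("local[4]", "hdfs-connect-test")

    # The hdfs:// URI is the namenode address from fs.defaultFS;
    # host, port and path here are placeholders.
    lines = sc.textFile("hdfs://namenode-host:8020/user/elina/sample.txt")
    print(lines.count())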
I would suggest you start off with local files to play around with. If you need to run Spark on the CDH cluster using YARN, then you need to use spark-submit against the YARN cluster; a minimal invocation is sketched at the end of this message, after the quoted question. You can see a very good example here:
https://spark.apache.org/docs/latest/running-on-yarn.html

On Wed, Jul 15, 2015 at 10:36 PM, Jeskanen, Elina <elina.jeska...@cgi.com> wrote:

> I have Spark 1.4 on my local machine and I would like to connect to our
> local 4-node Cloudera cluster. But how?
>
> In the example it says text_file = spark.textFile("hdfs://..."), but can
> you advise me on where to get this "hdfs://..." address?
>
> Thanks!
>
> Elina

--
Best Regards,
Ayan Guha
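P.S. The spark-submit invocation I mean would look roughly like this; treat it as a sketch, since the script name and resource settings are assumptions you would adapt to your cluster (the linked page above documents the full option list):

    spark-submit \
      --master yarn-cluster \
      --num-executors 4 \
      --executor-memory 2g \
      your_script.py

Note that HADOOP_CONF_DIR (or YARN_CONF_DIR) must point at the cluster's config directory here as well, so spark-submit knows where the ResourceManager and namenode are.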