On Mon, Sep 8, 2014 at 11:52 AM, Dimension Data, LLC. <
subscripti...@didata.us> wrote:

>  So just to clarify for me: when specifying 'spark.yarn.jar' as I did
> above, even if I don't use HDFS to create an
> RDD (e.g. do something simple like 'sc.parallelize(range(100))'), it is
> still necessary to configure the HDFS
> location in each NM's '/etc/hadoop/conf/*', just so that the NodeManagers
> can access the Spark jar in the YARN case?
>

That's correct. In fact, I'm not aware of YARN working at all without the
HDFS configuration being in place (even if the default filesystem is not
HDFS), but then I'm not a YARN deployment expert.
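For reference, a minimal sketch of the configuration being discussed. The
HDFS path and namenode address below are illustrative assumptions, not taken
from this thread:

```
# spark-defaults.conf (sketch) -- point YARN containers at a Spark assembly
# jar already uploaded to HDFS, so it isn't re-shipped on every submission.
# The hdfs:// URI here is an assumed example path.
spark.yarn.jar  hdfs://namenode:8020/user/spark/share/lib/spark-assembly.jar
```

Because 'spark.yarn.jar' is an hdfs:// URI, every NodeManager must be able to
resolve it, which is why each node still needs the Hadoop client configuration
(core-site.xml, hdfs-site.xml) under '/etc/hadoop/conf' even for jobs that
never read data from HDFS.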

-- 
Marcelo
