Hi Ajay,

Short answer: no, there is no easy way to do that. But if you'd like to
explore this topic, a good starting point would be this blog post from
SequenceIQ:
<http://blog.sequenceiq.com/blog/2014/08/22/spark-submit-in-java/>.

I have heard rumors that there is some work going on to provide a submit
API, but I am not a contributor, so I can't say whether that is true or how
far along the work is.
For now, the suggested way is to use the provided script: spark-submit.
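If you do want to trigger that script from Java code anyway, a rough sketch is to shell out to it with ProcessBuilder. This is untested against a real cluster; the main class, jar path, and master value below are placeholders you would replace with your own:

```java
import java.util.Arrays;
import java.util.List;

public class SubmitSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder values -- substitute your own class, jar, and master.
        List<String> cmd = Arrays.asList(
                "spark-submit",
                "--master", "yarn-client",
                "--class", "com.example.WordCount", // hypothetical main class
                "/path/to/wordcount.jar");          // hypothetical jar path

        // Print the command we would run.
        System.out.println(String.join(" ", cmd));

        // Uncomment to actually launch it. Requires spark-submit on PATH
        // and the cluster conf dir visible to it:
        // Process p = new ProcessBuilder(cmd).inheritIO().start();
        // int exit = p.waitFor();
    }
}
```

This only delegates to the same script, so it inherits the script's requirements (a local Spark install and the cluster configuration); it is not a real submit API.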

Regards
Dawid

2015-08-30 20:54 GMT+02:00 Ajay Chander <itsche...@gmail.com>:

> Hi David,
>
> Thanks for responding! My main intention was to submit a Spark job/jar to
> the YARN cluster from my Eclipse, within the code. Is there any way I
> could pass my YARN configuration somewhere in the code to submit the jar
> to the cluster?
>
> Thank you,
> Ajay
>
>
> On Sunday, August 30, 2015, David Mitchell <jdavidmitch...@gmail.com>
> wrote:
>
>> Hi Ajay,
>>
>> Are you trying to save to your local file system or to HDFS?
>>
>> // This would save to HDFS under "/user/hadoop/counter"
>> counter.saveAsTextFile("/user/hadoop/counter");
>>
>> David
>>
>>
>> On Sun, Aug 30, 2015 at 11:21 AM, Ajay Chander <itsche...@gmail.com>
>> wrote:
>>
>>> Hi Everyone,
>>>
>>> Recently we installed Spark on YARN in a Hortonworks cluster. Now I am
>>> trying to run a word count program in my Eclipse; I did
>>> setMaster("local") and I see the results as expected. Now I want to
>>> submit the same job to my YARN cluster from my Eclipse. In Storm I was
>>> doing basically the same thing by using the StormSubmitter class and
>>> passing the Nimbus and ZooKeeper hosts to a Config object. I was looking
>>> for something exactly like that.
>>>
>>> When I went through the documentation online, it read that I am supposed
>>> to "export HADOOP_HOME_DIR=path to the conf dir". So I copied the conf
>>> folder from one of Spark's gateway nodes to my local Unix box, and then
>>> exported that dir...
>>>
>>> export HADOOP_HOME_DIR=/Users/user1/Documents/conf/
>>>
>>> I did the same in .bash_profile too. Now when I do echo
>>> $HADOOP_HOME_DIR, I see the path printed in the command prompt. My
>>> assumption is that when I change setMaster("local") to
>>> setMaster("yarn-client") in my program, it should pick up the resource
>>> manager, i.e. the YARN cluster info, from the directory I exported, and
>>> the job should get submitted to the resource manager from my Eclipse.
>>> But somehow that is not happening. Please tell me if my assumption is
>>> wrong or if I am missing anything here.
>>>
>>> I have attached the word count program that I was using. Any help is
>>> highly appreciated.
>>>
>>> Thank you,
>>> Ajay
>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> For additional commands, e-mail: user-h...@spark.apache.org
>>>
>>
>>
>>
>> --
>> ### Confidential e-mail, for recipient's (or recipients') eyes only, not
>> for distribution. ###
>>
>
