In yarn cluster mode, the driver runs inside the AM, so you can find the driver log in the AM log. Open the ResourceManager UI and check the job and its logs, or run yarn logs -applicationId <appId>.
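For example, to locate the application ID and pull the logs from the command line (a minimal sketch, assuming YARN log aggregation is enabled; without it, the container logs stay on each NodeManager's local disk):

    # list applications known to the ResourceManager and note the application ID
    yarn application -list -appStates ALL

    # dump the aggregated container logs, including the AM (driver) log
    yarn logs -applicationId <appId>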
In yarn client mode, the driver runs in the same JVM you launch spark-submit from, so you get the driver log right there in the launcher's output.

On Thu, Jul 7, 2016 at 7:56 AM, Yu Wei <yu20...@hotmail.com> wrote:

> Launching via client deploy mode, it works again.
>
> I'm still a little confused about the behavior difference between cluster
> and client mode on a single machine.
>
> Thanks,
> Jared
>
> ------------------------------
> From: Mich Talebzadeh <mich.talebza...@gmail.com>
> Sent: Wednesday, July 6, 2016 9:46:11 PM
> To: Yu Wei
> Cc: Deng Ching-Mallete; user@spark.apache.org
> Subject: Re: Is that possible to launch spark streaming application on
> yarn with only one machine?
>
> I don't think deploy-mode cluster will work.
>
> Try --master yarn --deploy-mode client
>
> FYI
>
> - Spark Local: Spark runs on the local host. This is the simplest setup,
>   best suited for learners who want to understand different concepts of
>   Spark and for unit testing.
>
> - Spark Standalone: a simple cluster manager included with Spark that
>   makes it easy to set up a cluster.
>
> - YARN cluster mode: the Spark driver runs inside an application master
>   process which is managed by YARN on the cluster, and the client can go
>   away after initiating the application. This is invoked with --master
>   yarn --deploy-mode cluster.
>
> - YARN client mode: the driver runs in the client process, and the
>   application master is only used for requesting resources from YARN.
>   Unlike Spark standalone mode, in which the master's address is given in
>   the --master parameter, in YARN mode the ResourceManager's address is
>   picked up from the Hadoop configuration, so the --master parameter is
>   just yarn. This is invoked with --deploy-mode client.
>
> HTH
>
> Dr Mich Talebzadeh
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> On 6 July 2016 at 12:31, Yu Wei <yu20...@hotmail.com> wrote:
>
>> Hi Deng,
>>
>> I tried the same code again.
>>
>> It seemed that when launching the application via yarn on a single node,
>> JavaDStream.print() did not work. However, occasionally it worked.
>>
>> If I launch the same application in local mode, it always works.
>>
>> The code is as below:
>>
>>     SparkConf conf = new SparkConf().setAppName("Monitor&Control");
>>     JavaStreamingContext jssc =
>>         new JavaStreamingContext(conf, Durations.seconds(1));
>>     JavaReceiverInputDStream<String> inputDS =
>>         MQTTUtils.createStream(jssc, "tcp://114.55.145.185:1883", "Control");
>>     inputDS.print();
>>     jssc.start();
>>     jssc.awaitTermination();
>>
>> Command for launching via yarn (did not work):
>>
>>     spark-submit --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 2g target/CollAna-1.0-SNAPSHOT.jar
>>
>> Command for launching in local mode (works):
>>
>>     spark-submit --master local[4] --driver-memory 4g --executor-memory 2g --num-executors 4 target/CollAna-1.0-SNAPSHOT.jar
>>
>> Any advice?
>>
>> Thanks,
>> Jared
>>
>> ------------------------------
>> From: Yu Wei <yu20...@hotmail.com>
>> Sent: Tuesday, July 5, 2016 4:41 PM
>> To: Deng Ching-Mallete
>> Cc: user@spark.apache.org
>> Subject: Re: Is that possible to launch spark streaming application on
>> yarn with only one machine?
>>
>> Hi Deng,
>>
>> Thanks for the help. Actually I need to pay more attention to memory
>> usage.
>>
>> I found the root cause of my problem. It seemed to be in the spark
>> streaming MQTTUtils module.
>>
>> When I use "localhost" in the brokerUrl, it doesn't work.
>>
>> After changing it to "127.0.0.1", it works now.
>>
>> Thanks again,
>> Jared
>>
>> ------------------------------
>> From: odeach...@gmail.com <odeach...@gmail.com> on behalf of Deng
>> Ching-Mallete <och...@apache.org>
>> Sent: Tuesday, July 5, 2016 4:03:28 PM
>> To: Yu Wei
>> Cc: user@spark.apache.org
>> Subject: Re: Is that possible to launch spark streaming application on
>> yarn with only one machine?
>>
>> Hi Jared,
>>
>> You can launch a Spark application even with just a single node in YARN,
>> provided that the node has enough resources to run the job.
>>
>> It might also be good to note that when YARN calculates the memory
>> allocation for the driver and the executors, there is an additional
>> memory overhead added for each container, and then the total gets
>> rounded up to the nearest GB, IIRC. So the 4G driver memory + 4 x 2G
>> executor memory do not necessarily translate to a total of 12G of memory
>> allocation. It would be more than that, so the node needs more than 12G
>> of memory for the job to execute in YARN. You should see something like
>> "No resources available in cluster.." in the application master logs in
>> YARN if that is the case. (A rough sketch of this arithmetic follows at
>> the end of the thread below.)
>>
>> HTH,
>> Deng
>>
>> On Tue, Jul 5, 2016 at 4:31 PM, Yu Wei <yu20...@hotmail.com> wrote:
>>
>>> Hi guys,
>>>
>>> I set up a pseudo-distributed hadoop/yarn cluster on my laptop.
>>>
>>> I wrote a simple spark streaming program as below to receive messages
>>> with MQTTUtils:
>>>
>>>     SparkConf conf = new SparkConf().setAppName("Monitor&Control");
>>>     JavaStreamingContext jssc =
>>>         new JavaStreamingContext(conf, Durations.seconds(1));
>>>     JavaReceiverInputDStream<String> inputDS =
>>>         MQTTUtils.createStream(jssc, brokerUrl, topic);
>>>     inputDS.print();
>>>     jssc.start();
>>>     jssc.awaitTermination();
>>>
>>> If I submit the app with "--master local[4]", it works well:
>>>
>>>     spark-submit --master local[4] --driver-memory 4g --executor-memory 2g --num-executors 4 target/CollAna-1.0-SNAPSHOT.jar
>>>
>>> If I submit with "--master yarn", there is no output from
>>> "inputDS.print()":
>>>
>>>     spark-submit --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 2g --num-executors 4 target/CollAna-1.0-SNAPSHOT.jar
>>>
>>> Is it possible to launch a spark application on yarn with only a single
>>> node?
>>>
>>> Thanks for your advice.
>>>
>>> Jared
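To put rough numbers on the overhead Deng describes above (a sketch only, assuming the defaults of spark.yarn.{driver,executor}.memoryOverhead = max(384MB, 10% of the container size) and yarn.scheduler.minimum-allocation-mb = 1024MB; your cluster's values may differ):

    driver:    4096MB + max(384MB, 410MB) = 4506MB  -> rounded up to 5GB
    executors: 4 x (2048MB + max(384MB, 205MB)) = 4 x 2432MB -> 4 x 3GB = 12GB
    total:     about 17GB requested from YARN, not 12GB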