Re: Running Spark in local mode

2016-06-19 Thread Ashok Kumar
Thank you all, sirs. Appreciated, Mich, your clarification. On Sunday, 19 June 2016, 19:31, Mich Talebzadeh wrote: Thanks Jonathan for your points. I am aware of the fact that yarn-client and yarn-cluster are both deprecated (they still work in 1.6.1), hence the new nomenclature. Bear in mind this

Re: Running Spark in local mode

2016-06-19 Thread Mich Talebzadeh
Thanks Jonathan for your points. I am aware of the fact that yarn-client and yarn-cluster are both deprecated (they still work in 1.6.1), hence the new nomenclature. Bear in mind this is what I stated in my notes: "In YARN Cluster Mode, the Spark driver runs inside an application master process which is mana

Re: Running Spark in local mode

2016-06-19 Thread Jonathan Kelly
Mich, what Jacek is saying is not that you implied that YARN relies on two masters. He's just clarifying that yarn-client and yarn-cluster modes are really both using the same (type of) master (simply "yarn"). In fact, if you specify "--master yarn-client" or "--master yarn-cluster", spark-submit w
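
A minimal sketch of Jonathan's point, assuming Spark 1.6: the master is simply "yarn", and the client/cluster distinction travels separately as a deploy mode (here via the spark.submit.deployMode property; the app name and job are made up for illustration):

    import org.apache.spark.{SparkConf, SparkContext}

    object YarnClientDemo {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("yarn-client-demo")
          .setMaster("yarn")                         // replaces deprecated "yarn-client"/"yarn-cluster"
          .set("spark.submit.deployMode", "client")  // driver runs in this JVM
        val sc = new SparkContext(conf)
        println(sc.parallelize(1 to 100).sum())
        sc.stop()
      }
    }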

Re: Running Spark in local mode

2016-06-19 Thread Mich Talebzadeh
Good points, but I am an experimentalist. In local mode I have this: with --master local, Spark will start with one thread, equivalent to --master local[1]. You can also start with more than one thread by specifying the number of threads *k* in --master local[k]. You can also start us
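
A short sketch of the thread-count variants Mich lists, assuming Spark 1.6 (the app name is illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    // --master local      -> one thread, same as local[1]
    // --master local[4]   -> four threads
    // --master local[*]   -> one thread per available core
    val sc = new SparkContext(
      new SparkConf().setAppName("local-threads").setMaster("local[4]"))
    println(sc.defaultParallelism) // 4 with the master URL above
    sc.stop()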

Re: Running Spark in local mode

2016-06-19 Thread Jacek Laskowski
On Sun, Jun 19, 2016 at 12:30 PM, Mich Talebzadeh wrote: > Spark Local - Spark runs on the local host. This is the simplest setup and > best suited for learners who want to understand different concepts of Spark > and those performing unit testing. There are also the less-common master URLs: *
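
Jacek's list is cut off above; two real but less-common master URLs he may be referring to are sketched here (values are illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    // local[N,F]: N threads, and each task may fail up to F times before the job aborts
    // (plain local[N] fails fast after a single task failure)
    val sc = new SparkContext(
      new SparkConf().setAppName("local-retries").setMaster("local[4,3]"))
    // local-cluster[numWorkers,coresPerWorker,memoryPerWorkerMB] also exists,
    // but it is mainly used for Spark's own tests.
    sc.stop()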

Re: Running Spark in local mode

2016-06-19 Thread Mich Talebzadeh
Spark works in different modes: local (neither Spark nor anything else manages resources) and standalone (Spark itself manages resources), plus others (see below). These are from my notes, excluding Mesos, which I have not used. - Spark Local - Spark runs on the local host. This is the sim

Re: Running Spark in local mode

2016-06-19 Thread Takeshi Yamamuro
There are many technical differences inside, though usage is almost the same in each case. Yes, in standalone mode, Spark runs as a cluster: see http://spark.apache.org/docs/1.6.1/cluster-overview.html // maropu On Sun, Jun 19, 2016 at 6:14 PM, Ashok Kumar wrote: > thank you > > W
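
A minimal sketch of the standalone (cluster) case from the linked overview, assuming a standalone master is already running; the host name is a placeholder:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("standalone-demo")
      .setMaster("spark://master-host:7077") // standalone master URL, unlike local[k]
    val sc = new SparkContext(conf)
    println(sc.parallelize(1 to 100).count())
    sc.stop()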

Re: Running Spark in local mode

2016-06-19 Thread Ashok Kumar
Thank you. What are the main differences between local mode and standalone mode? I understand local mode does not support a cluster. Is that the only difference? On Sunday, 19 June 2016, 9:52, Takeshi Yamamuro wrote: Hi, In local mode, Spark runs in a single JVM that has a master an

Re: Running Spark in local mode

2016-06-19 Thread Takeshi Yamamuro
Hi, in local mode, Spark runs in a single JVM that has a master and one executor with `k` threads. https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/local/LocalSchedulerBackend.scala#L94 // maropu On Sun, Jun 19, 2016 at 5:39 PM, Ashok Kumar wrote: >
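
One way to see the single-JVM, k-thread behavior Takeshi describes (assumes Spark 1.6; the thread names come from the executor's task launch workers):

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(
      new SparkConf().setAppName("thread-demo").setMaster("local[2]"))
    sc.parallelize(1 to 8, 8)
      .map(_ => Thread.currentThread().getName)
      .collect()
      .distinct
      .foreach(println) // e.g. "Executor task launch worker-0"/"worker-1", all in one JVM
    sc.stop()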

Re: Running Spark in Local Mode

2015-06-11 Thread mrm
Hi, Did you resolve this? I have the same questions.

Re: Running Spark in local mode seems to ignore local[N]

2015-05-11 Thread Dmitry Goldenberg
Seems to be running OK with 4 threads, 16 threads... While running with 32 threads I started getting the warnings below. 15/05/11 19:48:46 WARN executor.Executor: Issue communicating with driver in heartbeater org.apache.spark.SparkException: Error sending message [message = Heartbeat(,[Lscala.Tuple2;@7668

Re: Running Spark in local mode seems to ignore local[N]

2015-05-11 Thread Dmitry Goldenberg
Thanks, Sean. This was data I had not yet digested :) "The number of partitions in a streaming RDD is determined by the block interval and the batch interval." I have seen the bit on spark.streaming.blockInterval in the docs but I didn't connect it with the batch interval and the number of partit

Re: Running Spark in local mode seems to ignore local[N]

2015-05-11 Thread Sean Owen
You might have a look at the Spark docs to start. 1 batch = 1 RDD, but 1 RDD can have many partitions. And should, for scale. You do not submit multiple jobs to get parallelism. The number of partitions in a streaming RDD is determined by the block interval and the batch interval. If you have a ba
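
A sketch of the arithmetic Sean describes, assuming a receiver-based stream in Spark 1.x (the interval values are illustrative):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf()
      .setMaster("local[4]")
      .setAppName("block-interval-demo")
      .set("spark.streaming.blockInterval", "500ms") // default is 200ms
    val ssc = new StreamingContext(conf, Seconds(2))
    // batch interval / block interval = 2000ms / 500ms = 4 blocks per batch,
    // so each receiver contributes ~4 partitions to every streaming RDD.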

Re: Running Spark in local mode seems to ignore local[N]

2015-05-11 Thread Dmitry Goldenberg
Understood. We'll use the multi-threaded code we already have. How are these execution slots filled up? I assume each slot is dedicated to one submitted task. If that's the case, how is each task distributed then, i.e. how is that task run in a multi-node fashion? Say 1000 batches/RDDs are ext

Re: Running Spark in local mode seems to ignore local[N]

2015-05-11 Thread Sean Owen
BTW I think my comment was wrong, as Marcelo demonstrated. In standalone mode you'd have one worker, and you do have one executor, but his explanation is right. But you certainly have execution slots for each core. Are you talking about your own user code? You can make threads, but that's nothing

Re: Running Spark in local mode seems to ignore local[N]

2015-05-11 Thread Dmitry Goldenberg
Sean, How does this model actually work? Let's say we want to run one job as N threads executing one particular task, e.g. streaming data out of Kafka into a search engine. How do we configure our Spark job execution? Right now, I'm seeing this job running as a single thread. And it's quite a bi
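
One common pattern for widening receiver-based Kafka input, not necessarily what this thread settles on (assumes the spark-streaming-kafka 1.x artifact; the ZooKeeper quorum, group, and topic names are placeholders):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    val ssc = new StreamingContext(
      new SparkConf().setMaster("local[8]").setAppName("kafka-parallel"), Seconds(2))
    // Four receivers consuming the same topic in parallel, then unioned:
    val streams = (1 to 4).map { _ =>
      KafkaUtils.createStream(ssc, "zk-host:2181", "my-group", Map("my-topic" -> 1))
    }
    ssc.union(streams).count().print()
    ssc.start()
    ssc.awaitTermination()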

Re: Running Spark in local mode seems to ignore local[N]

2015-05-11 Thread Marcelo Vanzin
Are you actually running anything that requires all those slots? e.g., locally, I get this with "local[16]", but only after I run something that actually uses those 16 slots: "Executor task launch worker-15" daemon prio=10 tid=0x7f4c80029800 nid=0x8ce waiting on condition [0x7f4c62493000]
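
Marcelo's point can be reproduced with a job that actually occupies the slots; a sketch assuming local[16] (the sleep is only there to keep the tasks alive long enough for a jstack):

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(
      new SparkConf().setAppName("fill-slots").setMaster("local[16]"))
    // 16 long-running tasks pin all 16 slots; a jstack taken now shows
    // the "Executor task launch worker-N" threads from Marcelo's dump.
    sc.parallelize(1 to 16, 16).foreach(_ => Thread.sleep(60000))
    sc.stop()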

Re: Running Spark in local mode seems to ignore local[N]

2015-05-11 Thread Sean Owen
You have one worker with one executor with 32 execution slots. On Mon, May 11, 2015 at 9:52 PM, dgoldenberg wrote: > Hi, > > Is there anything special one must do, running locally and submitting a job > like so: > > spark-submit \ > --class "com.myco.Driver" \ > --master local[*]
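
Sean's "32 execution slots" follows from local[*] claiming one slot per available core; a quick check, assuming Spark 1.x:

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(
      new SparkConf().setAppName("slot-count").setMaster("local[*]"))
    println(sc.defaultParallelism)                  // 32 on a 32-core box
    println(Runtime.getRuntime.availableProcessors) // the same number
    sc.stop()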

Re: Running Spark in Local Mode

2015-03-29 Thread Saisai Shao
Hi, I think for local mode, the number N (the number of threads) basically equals N available cores in ONE executor (worker), not N workers. You can imagine local[N] as having one worker with N cores. I'm not sure you can set the memory usage for each thread; for Spark the memory is share
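
Since the whole application runs in one JVM in local mode, the heap is shared rather than divided per thread; a quick check (the size printed is the driver JVM's heap, set at launch with --driver-memory):

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(
      new SparkConf().setAppName("shared-heap").setMaster("local[4]"))
    // One heap for the master, the executor, and all 4 task threads:
    println(s"Shared JVM heap: ${Runtime.getRuntime.maxMemory / (1024 * 1024)} MB")
    sc.stop()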