Thank you all sirs
Appreciated your clarification, Mich.
On Sunday, 19 June 2016, 19:31, Mich Talebzadeh wrote:
Thanks Jonathan for your points.
I am aware of the fact that yarn-client and yarn-cluster are both deprecated
(they still work in 1.6.1), hence the new nomenclature.
Bear in mind this is what I stated in my notes:
"YARN Cluster Mode, the Spark driver runs inside an application master
process which is mana
Mich, what Jacek is saying is not that you implied that YARN relies on two
masters. He's just clarifying that yarn-client and yarn-cluster modes are
really both using the same (type of) master (simply "yarn"). In fact, if
you specify "--master yarn-client" or "--master yarn-cluster", spark-submit
will simply translate it into --master yarn with the matching --deploy-mode
(client or cluster).
Good points, but I am an experimentalist.
In Local mode I have this.
In local mode with:
--master local
This will start with one thread, equivalent to --master local[1]. You can
also start with more than one thread by specifying the number of threads *k*
in --master local[k]. You can also start using all available cores with
--master local[*].
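A minimal sketch of the same point (the app name and thread counts below are placeholders, assuming the 1.x RDD API): the thread count comes straight from the master URL.

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("local-mode-sketch")   // app name is just a placeholder
  .setMaster("local[4]")             // 4 task threads; "local" = 1 thread, "local[*]" = all cores
val sc = new SparkContext(conf)
println(sc.parallelize(1 to 100).count())   // runs entirely inside this one JVM
sc.stop()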
On Sun, Jun 19, 2016 at 12:30 PM, Mich Talebzadeh wrote:
> Spark Local - Spark runs on the local host. This is the simplest setup and
> best suited for learners who want to understand different concepts of Spark
> and those performing unit testing.
There are also the less-common master URLs:
Spark works in different modes: local (neither Spark nor anything else
manages resources), standalone (Spark itself manages resources), plus others
(see below).
These are from my notes, excluding Mesos, which I have not used:
- Spark Local - Spark runs on the local host. This is the simplest setup and
best suited for learners who want to understand different concepts of Spark
and those performing unit testing.
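For reference, and only as a sketch (host and port below are placeholders), the master URL is what selects each of these modes when you build a SparkConf:

import org.apache.spark.SparkConf

// Local mode: no separate resource manager; everything runs in one JVM.
val localConf      = new SparkConf().setMaster("local[2]")
// Standalone mode: Spark's own cluster manager (host and port are placeholders).
val standaloneConf = new SparkConf().setMaster("spark://master-host:7077")
// YARN mode: resources are managed by YARN; client vs cluster is chosen via the deploy mode.
val yarnConf       = new SparkConf().setMaster("yarn")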
There are many technical differences internally, though; how you use them is
almost the same.
Yeah, in standalone mode, Spark runs in a cluster fashion: see
http://spark.apache.org/docs/1.6.1/cluster-overview.html
// maropu
On Sun, Jun 19, 2016 at 6:14 PM, Ashok Kumar wrote:
thank you
What are the main differences between local mode and standalone mode? I
understand local mode does not support a cluster. Is that the only difference?
On Sunday, 19 June 2016, 9:52, Takeshi Yamamuro wrote:
Hi,
In local mode, Spark runs in a single JVM that has a master and one executor
with `k` threads.
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/local/LocalSchedulerBackend.scala#L94
// maropu
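A small way to see that (a sketch, assuming a plain 1.6-style SparkContext): with local[k] the single in-JVM executor exposes k task slots, so the default parallelism should come back as k.

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("local-slots").setMaster("local[4]"))
// Master, driver and the one executor all live in this JVM; the executor is 4 threads,
// so this should print 4 (unless spark.default.parallelism has been set explicitly).
println(sc.defaultParallelism)
sc.stop()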
On Sun, Jun 19, 2016 at 5:39 PM, Ashok Kumar wrote:
>
Hi,
Did you resolve this? I have the same questions.
--
Seems to be running OK with 4 threads, 16 threads... While running with 32
threads I started getting the below.
15/05/11 19:48:46 WARN executor.Executor: Issue communicating with driver
in heartbeater
org.apache.spark.SparkException: Error sending message [message =
Heartbeat(,[Lscala.Tuple2;@7668
Thanks, Sean. I had not yet fully digested this :)
"The number of partitions in a streaming RDD is determined by the block
interval and the batch interval." I have seen the bit on
spark.streaming.blockInterval in the doc, but I didn't connect it with the
batch interval and the number of partitions.
You might have a look at the Spark docs to start. 1 batch = 1 RDD, but
1 RDD can have many partitions. And should, for scale. You do not
submit multiple jobs to get parallelism.
The number of partitions in a streaming RDD is determined by the block
interval and the batch interval. If you have a ba
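To make that relationship concrete (the numbers are illustrative, not from the thread): with a 2-second batch interval and the default spark.streaming.blockInterval of 200 ms, each receiver produces roughly 2000 ms / 200 ms = 10 blocks per batch, so each batch RDD from that receiver has about 10 partitions.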
Understood. We'll use the multi-threaded code we already have.
How are these execution slots filled up? I assume each slot is dedicated to
one submitted task. If that's the case, how is each task distributed then,
i.e. how is that task run in a multi-node fashion? Say 1000 batches/RDD's
are ext
BTW I think my comment was wrong, as Marcelo demonstrated. In standalone
mode you'd have one worker, and you do have one executor, but his explanation
is right. You certainly have execution slots for each core.
Are you talking about your own user code? You can make threads, but
that's nothing
Sean,
How does this model actually work? Let's say we want to run one job as N
threads executing one particular task, e.g. streaming data out of Kafka
into a search engine. How do we configure our Spark job execution?
Right now, I'm seeing this job running as a single thread. And it's quite a
bi
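Not the poster's actual job, just a sketch of the usual approach (the socket source and the println stand in for Kafka and the search-engine client, and local[8] is an arbitrary choice): parallelism comes from the partitions within each batch, so you give the application enough slots and repartition before the expensive write, rather than submitting multiple jobs.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setAppName("stream-to-search-sketch").setMaster("local[8]")
val ssc  = new StreamingContext(conf, Seconds(2))

// Stand-in for the Kafka source, only to keep the sketch self-contained.
val lines = ssc.socketTextStream("localhost", 9999)

lines.repartition(8).foreachRDD { rdd =>            // 8 partitions -> up to 8 concurrent tasks
  rdd.foreachPartition { docs =>
    docs.foreach(doc => println(s"indexing: $doc")) // stand-in for the search-engine write
  }
}

ssc.start()
ssc.awaitTermination()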
Are you actually running anything that requires all those slots? e.g.,
locally, I get this with "local[16]", but only after I run something that
actually uses those 16 slots:
"Executor task launch worker-15" daemon prio=10 tid=0x7f4c80029800
nid=0x8ce waiting on condition [0x7f4c62493000]
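A quick way to reproduce that (a sketch, assuming a spark-shell started with --master local[16]): run something that actually needs the slots, and the task-launch worker threads show up in the dump.

// 32 partitions on local[16]: the tasks run 16 at a time on the
// "Executor task launch worker-*" threads visible in jstack while this runs.
sc.parallelize(1 to 1000000, 32).map(_ * 2).count()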
You have one worker with one executor with 32 execution slots.
On Mon, May 11, 2015 at 9:52 PM, dgoldenberg wrote:
> Hi,
>
> Is there anything special one must do, running locally and submitting a job
> like so:
>
> spark-submit \
> --class "com.myco.Driver" \
> --master local[*]
Hi,
I think for local mode, the number N (N being the number of threads) basically
equals the number of available cores in ONE executor (worker), not N workers.
You could imagine local[N] as having one worker with N cores. I'm not sure you
can set the memory usage for each thread; for Spark the memory is shared.
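To illustrate that last point (a sketch; local[8] and 4g are arbitrary): in local[N] the only JVM is the driver's, so the heap all N task threads share is whatever the driver gets, and in practice that has to be sized before the JVM starts, e.g. with --driver-memory 4g on spark-submit.

import org.apache.spark.{SparkConf, SparkContext}

// local[8]: one JVM, one executor, 8 task threads sharing a single heap.
val conf = new SparkConf()
  .setAppName("local-memory-sketch")
  .setMaster("local[8]")
// Note: setting spark.driver.memory here would be too late, because this JVM is
// already running; use --driver-memory (or spark-defaults.conf) instead.
val sc = new SparkContext(conf)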