Hi Sachin, here are two posts about the basic concepts of Spark:

   - spark-questions-concepts
   <http://litaotao.github.io/spark-questions-concepts?s=gmail>
   - deep-into-spark-exection-model
   <http://litaotao.github.io/deep-into-spark-exection-model?s=gmail>


Also, I highly recommend this Databricks post:
https://databricks.com/blog/2016/06/22/apache-spark-key-terms-explained.html


On Thu, Jul 21, 2016 at 1:36 AM, Jean Georges Perrin <j...@jgp.net> wrote:

> Hey,
>
> I love when questions are numbered, it's easier :)
>
> 1) Yes (but I am not an expert)
> 2) You don't control it... One of my processes goes up to 8k tasks, so...
> 3) Yes, if you have HT it doubles. My servers have 12 cores, but with HT
> that makes 24.
> 4) From my understanding: the slave is the logical computational unit and
> the worker is really the one doing the job.
> 5) Dunnoh
> 6) Dunnoh
>
> On Jul 20, 2016, at 1:30 PM, Sachin Mittal <sjmit...@gmail.com> wrote:
>
> Hi,
> I was able to build and run my spark application via spark submit.
>
> I have understood some of the concepts by going through the resources at
> https://spark.apache.org, but a few doubts still remain. I have a few
> specific questions and would be glad if someone could shed some light on
> them.
>
> So I submitted the application using spark.master local[*], and I have an
> 8-core PC.
>
> - What I understand is that the application is called a job. Since mine
> had two stages, it gets divided into 2 stages, and each stage had a number
> of tasks which ran in parallel.
> Is this understanding correct?
>
> - What I notice is that each stage is further divided into 262 tasks.
> Where did this number 262 come from? Is it configurable? Would increasing
> this number improve performance?
>
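The task count per stage equals the number of partitions of the RDD/DataFrame
that stage processes, so 262 most likely reflects the number of input splits of
your data source rather than any fixed Spark constant. It is configurable: you
can repartition explicitly in code, or set the defaults at submit time. A
minimal sketch (the application name and partition counts below are just
placeholders, not anything from your setup):

```shell
# Hypothetical spark-submit invocation overriding default partition counts.
# spark.default.parallelism applies to RDD shuffles;
# spark.sql.shuffle.partitions applies to DataFrame/SQL shuffles
# (its stock default is 200).
spark-submit \
  --master "local[*]" \
  --conf spark.default.parallelism=16 \
  --conf spark.sql.shuffle.partitions=16 \
  my_app.py
```

More partitions can improve load balancing, but only up to a point: past it,
per-task scheduling overhead dominates, so raising the number does not
automatically improve performance.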
> - Also I see that the tasks are run in parallel in sets of 8. Is this
> because I have an 8-core PC?
>
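Yes: local[*] tells Spark to use as many worker threads as there are logical
cores on the machine. You can check that number without Spark at all (plain
Python):

```python
import os

# local[*] sizes Spark's local thread pool to the logical core count;
# this reports the same number the local scheduler would use.
print(os.cpu_count())
```

On an 8-core PC without hyper-threading this prints 8, which matches the 8
tasks you see running at once.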
> - What is the difference or relation between a slave and a worker? When I
> did spark-submit, did it start 8 slaves or worker threads?
>
> - I see all worker threads running in one single JVM. Is this because I
> did not start the slaves separately and connect them to a single master
> cluster manager? If I had done that, each worker would have run in its own
> JVM.
>
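Right: in local[*] mode the driver, the scheduler, and all task threads share
one JVM, and there are no separate worker processes at all. To get workers in
their own JVMs you would start a standalone cluster and submit against its
master URL. A sketch, assuming SPARK_HOME is set and using a placeholder
hostname:

```shell
# Start the standalone master (it logs its spark://host:7077 URL).
$SPARK_HOME/sbin/start-master.sh

# Start a worker JVM and register it with the master.
# (On older Spark releases this script is named start-slave.sh.)
$SPARK_HOME/sbin/start-worker.sh spark://master-host:7077

# Submit against the cluster instead of local[*].
spark-submit --master spark://master-host:7077 my_app.py
```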
> - What is the relationship between a worker and an executor? Can a worker
> have more than one executor? If yes, how do we configure that? Do all
> executors run in the worker JVM as independent threads?
>
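On the worker/executor question: the worker is the daemon process on a cluster
node, and it launches one or more executor JVMs on behalf of applications.
Each executor is a separate JVM (not a thread inside the worker), and it runs
its tasks as threads. A worker can host several executors when the resource
settings allow it; a hedged sketch of the relevant spark-submit knobs (the
hostname and values are placeholders):

```shell
# Standalone mode: each executor gets 2 cores, capped at 8 cores total,
# so a single 8-core worker could launch up to 4 executors for this app.
# (--num-executors is a YARN-specific flag; standalone mode instead divides
# total-executor-cores by executor-cores.)
spark-submit \
  --master spark://master-host:7077 \
  --executor-cores 2 \
  --executor-memory 2g \
  --total-executor-cores 8 \
  my_app.py
```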
> I suppose that is all for now. I would appreciate any response, and will
> add follow-up questions if any.
>
> Thanks
> Sachin
>
>
>
>


-- 
*___________________*
Quant | Engineer | Boy
*___________________*
*blog*:    http://litaotao.github.io
<http://litaotao.github.io?utm_source=spark_mail>
*github*: www.github.com/litaotao
