Hi Sachin, there are no plans to translate these into English at the moment, sorry about that, but you can check Databricks' blog; there are lots of high-quality, easy-to-understand posts there.
Or you can check the list in this post of mine and choose the English version:

- spark-resouces-blogs-paper <http://litaotao.github.io/spark-resouces-blogs-paper?s=gmail>

On Thu, Jul 21, 2016 at 12:19 PM, Sachin Mittal <sjmit...@gmail.com> wrote:

> Hi,
> Thanks for the links, is there any English translation for the same?
>
> Sachin
>
>
> On Thu, Jul 21, 2016 at 8:34 AM, Taotao.Li <charles.up...@gmail.com> wrote:
>
>> Hi, Sachin, here are two posts about the basic concepts of Spark:
>>
>> - spark-questions-concepts <http://litaotao.github.io/spark-questions-concepts?s=gmail>
>> - deep-into-spark-exection-model <http://litaotao.github.io/deep-into-spark-exection-model?s=gmail>
>>
>> And I fully recommend Databricks' post:
>> https://databricks.com/blog/2016/06/22/apache-spark-key-terms-explained.html
>>
>>
>> On Thu, Jul 21, 2016 at 1:36 AM, Jean Georges Perrin <j...@jgp.net> wrote:
>>
>>> Hey,
>>>
>>> I love when questions are numbered, it's easier :)
>>>
>>> 1) Yes (but I am not an expert)
>>> 2) You don't control... One of my processes goes up to 8k tasks, so...
>>> 3) Yes, and if you have HT it doubles. My servers have 12 cores, but with HT that makes 24.
>>> 4) From my understanding: the slave is the logical computational unit and the worker is really the one doing the job.
>>> 5) Dunnoh
>>> 6) Dunnoh
>>>
>>> On Jul 20, 2016, at 1:30 PM, Sachin Mittal <sjmit...@gmail.com> wrote:
>>>
>>> Hi,
>>> I was able to build and run my Spark application via spark-submit.
>>>
>>> I have understood some of the concepts by going through the resources at https://spark.apache.org, but a few doubts still remain. I have a few specific questions and would be glad if someone could shed some light on them.
>>>
>>> I submitted the application using spark.master local[*], and I have an 8-core PC.
>>>
>>> - What I understand is that the application is called a job. Since mine had two stages, it got divided into 2 stages, and each stage had a number of tasks which ran in parallel. Is this understanding correct?
>>>
>>> - What I notice is that each stage is further divided into 262 tasks. Where did this number 262 come from? Is it configurable? Would increasing this number improve performance?
>>>
>>> - Also, I see that the tasks run in parallel in sets of 8. Is this because I have an 8-core PC?
>>>
>>> - What is the difference or relation between a slave and a worker? When I did spark-submit, did it start 8 slave or worker threads?
>>>
>>> - I see all worker threads running in one single JVM. Is this because I did not start slaves separately and connect them to a single master cluster manager? If I had done that, would each worker have run in its own JVM?
>>>
>>> - What is the relationship between a worker and an executor? Can a worker have more than one executor? If yes, how do we configure that? Do all executors run in the worker JVM as independent threads?
>>>
>>> I suppose that is all for now. Would appreciate any response. Will add follow-up questions if any.
>>>
>>> Thanks
>>> Sachin
>>>
>>
>>
>> --
>> *___________________*
>> Quant | Engineer | Boy
>> *___________________*
>> *blog*: http://litaotao.github.io <http://litaotao.github.io?utm_source=spark_mail>
>> *github*: www.github.com/litaotao
>

--
*___________________*
Quant | Engineer | Boy
*___________________*
*blog*: http://litaotao.github.io <http://litaotao.github.io?utm_source=spark_mail>
*github*: www.github.com/litaotao
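
On the task-count questions in the quoted thread (the 262 tasks per stage and the sets of 8): a minimal Scala sketch, assuming a plain SparkContext and a placeholder input file. The exact numbers you see depend on your input splits and settings, so treat the values below as examples only, not the thread's actual configuration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object PartitionDemo {
  def main(args: Array[String]): Unit = {
    // local[*] runs everything in this one JVM, with one task slot per logical core.
    val conf = new SparkConf()
      .setAppName("partition-demo")
      .setMaster("local[*]")
      // Default partition count used by shuffle operations such as
      // reduceByKey when no explicit count is given.
      .set("spark.default.parallelism", "16")

    val sc = new SparkContext(conf)

    // For file-based RDDs the partition count comes from the input splits,
    // not from spark.default.parallelism.
    val lines = sc.textFile("input.txt") // placeholder path
    println(s"partitions from input splits: ${lines.partitions.length}")

    // A stage has one task per partition of the RDD it computes,
    // so repartitioning changes how many tasks show up in the UI.
    val repartitioned = lines.repartition(32)
    println(s"partitions after repartition: ${repartitioned.partitions.length}")

    val counts = repartitioned
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _) // shuffle boundary -> new stage

    println(s"word count partitions: ${counts.partitions.length}")
    sc.stop()
  }
}
```

In other words, a stage's task count equals the partition count of the RDD it produces, and local[*] runs at most one task per logical core at a time, which is why tasks appear in waves of 8 on an 8-core machine.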
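
On the worker/executor questions: in local mode the driver and the single executor share one JVM, which is why all task threads show up together. Below is a rough sketch of how the same application could instead be pointed at a standalone cluster, where each executor is a separate JVM launched by a worker. The master URL, memory, and core numbers are made-up placeholders.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ClusterConfDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("cluster-conf-demo")
      // Standalone cluster manager instead of a single local JVM (placeholder host).
      .setMaster("spark://master-host:7077")
      // Each executor is a separate JVM started by a worker; tasks run as
      // threads inside the executor.
      .set("spark.executor.memory", "2g")
      .set("spark.executor.cores", "2") // task slots per executor
      .set("spark.cores.max", "8")      // total cores for this application

    val sc = new SparkContext(conf)
    // With 8 cores total and 2 cores per executor, the cluster manager can
    // start up to 4 executors, possibly more than one on the same worker
    // if that worker has enough free cores.
    println(sc.parallelize(1 to 1000, 8).sum())
    sc.stop()
  }
}
```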