I do not understand Chinese, but the diagrams on that page are very helpful.
On Tue, Jan 6, 2015 at 9:46 PM, eric wong <win19...@gmail.com> wrote:

> A good starting point if you read Chinese:
>
> https://github.com/JerryLead/SparkInternals/tree/master/markdown
>
> 2015-01-07 10:13 GMT+08:00 bit1...@163.com <bit1...@163.com>:
>
>> Thank you, Tobias. I will look into the Spark paper. But it looks like
>> the paper has been moved:
>> http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf
>> A "Resource not found" page is returned when I access it.
>>
>> ------------------------------
>> bit1...@163.com
>>
>>
>> *From:* Tobias Pfeiffer <t...@preferred.jp>
>> *Date:* 2015-01-07 09:24
>> *To:* Todd <bit1...@163.com>
>> *CC:* user <user@spark.apache.org>
>> *Subject:* Re: I think I am almost lost in the internals of Spark
>> Hi,
>>
>> On Tue, Jan 6, 2015 at 11:24 PM, Todd <bit1...@163.com> wrote:
>>
>>> I am fairly new to Spark; so far I have only tried simple things like
>>> word count and the examples given in the Spark SQL programming guide.
>>> Now I am investigating the internals of Spark, but I think I am almost
>>> lost, because I cannot grasp the whole picture of what Spark does when
>>> it executes the word count.
>>>
>>
>> I recommend understanding what an RDD is and how it is processed, using
>>
>> http://spark.apache.org/docs/latest/programming-guide.html#resilient-distributed-datasets-rdds
>> and probably also
>> http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf
>> (once the server is back).
>> Understanding how an RDD is processed is probably the most helpful step
>> toward understanding Spark as a whole.
>>
>> Tobias
>>
>>
>
>
> --
> 王海华
>