Hi,

On Tue, Jan 6, 2015 at 11:24 PM, Todd <bit1...@163.com> wrote:
> I am a bit new to Spark, except that I tried simple things like word
> count, and the examples given in the spark sql programming guide.
> Now, I am investigating the internals of Spark, but I think I am almost
> lost, because I could not grasp a whole picture what spark does when it
> executes the word count.
>

I recommend understanding what an RDD is and how it is processed, using
http://spark.apache.org/docs/latest/programming-guide.html#resilient-distributed-datasets-rdds
and probably also
http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf
(once the server is back). Understanding how an RDD is processed is
probably the most helpful way to understand Spark as a whole.

Tobias
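
P.S. In case a mental model helps before reading the links: below is a toy sketch in plain Python of how a word count over an RDD is processed. It is not Spark's code or API. ToyRDD, flat_map and reduce_by_key are hypothetical names made up for this illustration (the real operations are flatMap, map, reduceByKey and collect). It only tries to show three ideas: data is split into partitions, transformations are recorded lazily as lineage, and nothing is computed until an action such as collect runs. The shuffle is simplified to hashing keys into buckets within one process.

```python
class ToyRDD:
    """Toy stand-in for an RDD: partitioned data plus a recorded lineage."""

    def __init__(self, partitions, lineage=()):
        self.partitions = partitions    # list of lists, one list per partition
        self.lineage = list(lineage)    # pending (name, step) pairs, not yet run

    def _then(self, name, step):
        # Transformations are lazy: remember the step, compute nothing.
        return ToyRDD(self.partitions, self.lineage + [(name, step)])

    def flat_map(self, f):
        # Narrow step: each partition is processed independently.
        return self._then(
            "flat_map",
            lambda parts: [[y for x in part for y in f(x)] for part in parts],
        )

    def map(self, f):
        # Narrow step: each partition is processed independently.
        return self._then(
            "map",
            lambda parts: [[f(x) for x in part] for part in parts],
        )

    def reduce_by_key(self, f):
        # Wide step: records with the same key must meet in one place,
        # so keys are hashed into buckets (a simplified "shuffle").
        def shuffle_and_reduce(parts):
            buckets = [dict() for _ in parts]
            for part in parts:
                for key, value in part:
                    bucket = buckets[hash(key) % len(buckets)]
                    bucket[key] = f(bucket[key], value) if key in bucket else value
            return [list(bucket.items()) for bucket in buckets]

        return self._then("reduce_by_key", shuffle_and_reduce)

    def collect(self):
        # Action: only now is the recorded lineage actually executed.
        parts = self.partitions
        for _name, step in self.lineage:
            parts = step(parts)
        return [x for part in parts for x in part]


# Word count over two partitions of input lines.
lines = ToyRDD([["to be or", "not to be"], ["be quick"]])
counts = (
    lines.flat_map(str.split)
    .map(lambda word: (word, 1))
    .reduce_by_key(lambda a, b: a + b)
)

steps = [name for name, _ in counts.lineage]   # recorded, but not yet run
result = dict(counts.collect())                # the action triggers the work
```

Here steps holds the three recorded transformations before any data has been touched, and result holds the per-word counts only after collect is called. In real Spark the narrow steps run as tasks on the executors, one per partition, and the reduceByKey step moves data between machines, which is what the paper above explains properly.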