I'm trying to understand how Spark works under the hood, so I tried to read
the source code.

As I normally do, I cloned the git repository and checked out a very early
version (actually e5c4cd8a5e188592f8786a265c0cd073c69ac886, since the very
first version even lacked the definition of RDD.scala).

But the code looks "too simple" and I can't find where the "magic" happens,
i.e., where a transformation/computation gets scheduled onto a machine, where
bytes are stored, etc.
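To make concrete what I mean by the "magic", here is a toy sketch of the core idea as I understand it (my own illustration, not Spark's actual code): transformations like map() only record a lineage, and nothing runs until an action like collect() triggers a scheduler loop that computes each partition as a task.

```python
# Toy RDD (illustration only, not Spark's implementation): transformations are
# lazy and just build a lineage graph; the action walks the graph per partition.

class ToyRDD:
    def __init__(self, partitions, fn=None, parent=None):
        self.partitions = partitions  # list of lists: one list per partition
        self.fn = fn                  # transformation to apply lazily
        self.parent = parent          # lineage pointer to the parent RDD

    def map(self, fn):
        # No computation happens here: we only extend the lineage graph.
        return ToyRDD(self.partitions, fn=fn, parent=self)

    def compute(self, split):
        # Recursively apply the chain of transformations to one partition.
        data = self.parent.compute(split) if self.parent else self.partitions[split]
        return [self.fn(x) for x in data] if self.fn else data

    def collect(self):
        # The "scheduler" stand-in: run one task per partition, gather results.
        results = []
        for split in range(len(self.partitions)):
            results.extend(self.compute(split))
        return results

rdd = ToyRDD([[1, 2], [3, 4]]).map(lambda x: x * 10)
print(rdd.collect())  # [10, 20, 30, 40]
```

What I'd like to find is where the real versions of these pieces live in the source tree.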

It would be great if someone could show me a path through the source files
involved, so that I could read each of them in turn.

Thanks!
yang
