This code executes on the driver, and an "RDD" here is really just a
handle on all the distributed data out there. It's a local bookkeeping
object. So manipulating these objects in the local driver code has
virtually no performance impact. These two versions would perform
about identically.
How about using a fluent style of Scala programming?
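
To make the two suggestions above concrete, here is a minimal sketch and not the original poster's code (which is cut off in the quote below). It assumes Spark is on the classpath and runs in local mode; the names FluentVsModular, incrementAll, keepEven and compare are hypothetical. Each helper only takes an RDD handle and returns a new one, so nothing touches the data until collect() is called.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object FluentVsModular {
  // Hypothetical modular steps: each takes an RDD handle and returns a new
  // handle. Calling them on the driver only records the transformation.
  def incrementAll(rdd: RDD[Int]): RDD[Int] = rdd.map(_ + 1)
  def keepEven(rdd: RDD[Int]): RDD[Int] = rdd.filter(_ % 2 == 0)

  // Runs the same logic in three styles and returns the collected results.
  def compare(sc: SparkContext): (Seq[Int], Seq[Int], Seq[Int]) = {
    val input = sc.parallelize(1 to 10)

    // Modular version: named helper functions applied one after another.
    val modular = keepEven(incrementAll(input))

    // Fluent version: the same transformations chained inline.
    val fluent = input.map(_ + 1).filter(_ % 2 == 0)

    // Fluent composition of the named helpers with andThen.
    val pipeline: RDD[Int] => RDD[Int] = (incrementAll _).andThen(keepEven _)
    val composed = pipeline(input)

    // collect() is the action that actually triggers distributed work.
    (modular.collect().toSeq, fluent.collect().toSeq, composed.collect().toSeq)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("fluent-vs-modular").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val (modular, fluent, composed) = compare(sc)
      println(modular == fluent && fluent == composed)
    } finally {
      sc.stop()
    }
  }
}
```

All three styles describe the same map followed by the same filter, so the work done on the cluster should be the same; the choice between them is about readability in the driver code.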
On Fri, Nov 14, 2014 at 8:31 AM, Simone Franzini wrote:
> Let's say I have to apply a complex sequence of operations to a certain
> RDD.
> In order to make code more modular/readable, I would typically have
> something like this:
>
> object myO