
Re: Declaring multiple RDDs and efficiency concerns

2014-11-14 Thread Sean Owen

This code executes on the driver, and an "RDD" here is really just a handle on all the distributed data out there. It's a local bookkeeping object. So, manipulation of these objects themselves in the local driver code has virtually no performance impact. These two versions would be about identical*
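To make the "handle" point concrete without the original snippets, here is a rough sketch in plain Scala with no Spark dependency. `Handle`, `HandleDemo`, `source`, `stepByStep`, `fluent` and `touched` are hypothetical names invented for illustration. The class is only a toy stand-in for an RDD and is not how Spark implements one. It shows that transformations only build local bookkeeping objects, and that naming each intermediate step does the same work as chaining the calls.

```scala
// Toy stand-in for an RDD: a local handle that only records what to compute.
// Hypothetical class for illustration; this is not Spark's implementation.
final class Handle[A](private val plan: () => Iterator[A]) {
  // Transformations wrap the old plan in a new handle; nothing runs yet.
  def map[B](f: A => B): Handle[B] = new Handle(() => plan().map(f))
  def filter(p: A => Boolean): Handle[A] = new Handle(() => plan().filter(p))
  // The action is the only place where the recorded plan is executed.
  def collect(): List[A] = plan().toList
}

object HandleDemo {
  // Counts how many source elements are actually read.
  var touched = 0

  def source(): Handle[Int] =
    new Handle(() => Iterator.range(1, 6).map { x => touched += 1; x })

  // Version 1: one named handle per step.
  def stepByStep(): List[Int] = {
    val numbers = source()
    val doubled = numbers.map(_ * 2)
    val large = doubled.filter(_ > 4)
    large.collect()
  }

  // Version 2: the same steps chained in a fluent style.
  def fluent(): List[Int] = source().map(_ * 2).filter(_ > 4).collect()

  def main(args: Array[String]): Unit = {
    val pending = source().map(_ * 2) // builds handles only, reads nothing
    println(s"touched before any action: $touched")
    println(stepByStep())
    println(fluent())
    println(s"touched after two actions: $touched")
  }
}
```

Both versions return List(6, 8, 10), and each reads the five source elements exactly once, so the counter goes from 0 to 10 across the two calls. The extra `val`s in the first version only name local objects, which is why the choice between the two styles can be made on readability alone.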

Re: Declaring multiple RDDs and efficiency concerns

2014-11-14 Thread Rishi Yadav

How about using a fluent style of Scala programming?

On Fri, Nov 14, 2014 at 8:31 AM, Simone Franzini wrote:
> Let's say I have to apply a complex sequence of operations to a certain
> RDD.
> In order to make code more modular/readable, I would typically have
> something like this:
>
> object myO