Re: How Spark Execute chaining vs no chaining statements

2015-06-23 Thread Richard Marscher
There should be no difference, assuming you don't use the intermediate RDD values you are storing (rdd1, rdd2) for anything else. The first example still creates these intermediate RDD objects; you are just using them implicitly instead of storing them in named values. It's also worth pointing
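
The equivalence described above can be checked with a small sketch. It assumes Spark is on the classpath and runs in local mode. The object name ChainingDemo, the local[2] master, and the arithmetic inside the map calls are illustrative and do not come from the thread.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical demo object; not part of the original thread.
object ChainingDemo {

  // Builds the same three-step pipeline twice and returns both results.
  def run(sc: SparkContext): (Array[Int], Array[Int]) = {
    val rdd = sc.parallelize(1 to 10)

    // Chained form: the intermediate RDDs are still created,
    // they are just never bound to a name.
    val chained = rdd.map(_ + 1).map(_ * 2).map(_ - 3)

    // Unchained form: the same intermediate RDDs, each bound to a val.
    val rdd1 = rdd.map(_ + 1)
    val rdd2 = rdd1.map(_ * 2)
    val rdd3 = rdd2.map(_ - 3)

    // map is a lazy transformation, so nothing has executed yet.
    // Printing the lineage lets you compare the two forms by eye.
    println(chained.toDebugString)
    println(rdd3.toDebugString)

    // collect() is the action that actually triggers the jobs.
    (chained.collect(), rdd3.collect())
  }

  def main(args: Array[String]): Unit = {
    // local[2] is an assumption for a quick test; on a cluster the master
    // would normally be supplied through spark-submit instead.
    val conf = new SparkConf().setAppName("chaining-demo").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val (a, b) = run(sc)
      println(s"same results: ${a.sameElements(b)}")
    } finally {
      sc.stop()
    }
  }
}
```

The two lineage dumps should have the same shape. A difference would only arise if rdd1 or rdd2 were reused elsewhere, for example cached or fed into a second action.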

How Spark Execute chaining vs no chaining statements

2015-06-23 Thread Ashish Soni
Hi All,

What is the difference between the two forms below in terms of execution on a cluster with one or more worker nodes?

    rdd.map(...).map(...)...map(..)

vs

    val rdd1 = rdd.map(...)
    val rdd2 = rdd1.map(...)
    val rdd3 = rdd2.map(...)

Thanks,
Ashish