Re: Combining RDD's columns

2014-04-18 Thread Jeremy Freeman
Hi Ian, if I understand what you're after, you might find "zip" useful. From the docs: Zips this RDD with another one, returning key-value pairs with the first element in each RDD, second element in each RDD, etc. Assumes that the two RDDs have the *same number of partitions* and the *same numb
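
To make the pairing concrete, here is a minimal sketch in plain Python rather than Spark. It models each "single column" as a list of partitions and pairs elements position by position, which is the behavior the quoted docs describe for `zip`. The helper `zip_partitioned` and the sample data are hypothetical and exist only for illustration. In Spark itself the equivalent call would be `rdd1.zip(rdd2)`. The sketch assumes, as the Spark API documentation states for `zip`, that both sides have the same number of partitions and the same number of elements in each partition.

```python
from typing import List, Sequence, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def zip_partitioned(
    left: Sequence[Sequence[A]], right: Sequence[Sequence[B]]
) -> List[List[Tuple[A, B]]]:
    """Pair elements position by position, one partition at a time.

    Hypothetical helper: a plain-Python model of the zip semantics,
    not part of any Spark API.
    """
    if len(left) != len(right):
        raise ValueError("inputs have a different number of partitions")
    zipped = []
    for index, (lpart, rpart) in enumerate(zip(left, right)):
        if len(lpart) != len(rpart):
            raise ValueError(f"partition {index} differs in element count")
        zipped.append(list(zip(lpart, rpart)))
    return zipped


# Hypothetical sample data: three "columns", each split into two partitions.
names = [["Ann", "Bob"], ["Cy"]]
star_signs = [["Leo", "Aries"], ["Virgo"]]
ages = [[31, 45], [27]]

# Zipping twice yields nested pairs ((name, sign), age); flatten them into rows.
name_sign = zip_partitioned(names, star_signs)
rows = [
    [(name, sign, age) for (name, sign), age in part]
    for part in zip_partitioned(name_sign, ages)
]
```

Zipping three columns this way produces nested pairs, so the last step flattens them into one tuple per row. The same nesting would appear when chaining `zip` on real RDDs, where a `map` over the zipped result would do the flattening.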

Combining RDD's columns

2014-04-18 Thread Ian Ferreira
This may seem contrived, but suppose I wanted to create a collection of "single column" RDDs that contain calculated values, and I want to cache these to avoid recalculation, i.e. rdd1 = {Names}, rdd2 = {Star Sign}, rdd3 = {Age}. Then I want to create a new virtual RDD that is a collection of thes