Hi Ian,
If I understand what you're after, you might find "zip" useful. From the docs:
Zips this RDD with another one, returning key-value pairs with the first
element in each RDD, second element in each RDD, etc. Assumes that the two RDDs
have the *same number of partitions* and the *same numb
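
In case it helps to see why the partition layout matters, here is a minimal plain-Python model of a partition-wise zip. It is not Spark code. An "RDD" is modelled as a list of partitions, the helper names (zip_partitioned, map_partitioned) and the sample values are made up for illustration, and the per-partition element-count check reflects my understanding of the documented requirement. In Spark itself the corresponding call would simply be rdd1.zip(rdd2).

```python
# Plain-Python model of a partition-wise zip (illustration only, not Spark).
# An "RDD" here is a list of partitions; each partition is a list of elements.


def zip_partitioned(left, right):
    """Pair elements position by position, one partition at a time."""
    if len(left) != len(right):
        raise ValueError("the two collections have different numbers of partitions")
    zipped = []
    for index, (left_part, right_part) in enumerate(zip(left, right)):
        if len(left_part) != len(right_part):
            raise ValueError(f"partition {index} has different element counts")
        zipped.append(list(zip(left_part, right_part)))
    return zipped


def map_partitioned(data, fn):
    """Apply fn to every element, keeping the partition layout unchanged."""
    return [[fn(element) for element in part] for part in data]


# Hypothetical single-column data, split into the same two partitions.
names = [["Ann", "Bob"], ["Cy"]]
star_signs = [["Leo", "Aries"], ["Virgo"]]
ages = [[31, 45], [28]]

# Zipping twice gives ((name, sign), age); flatten that into one row per person.
nested = zip_partitioned(zip_partitioned(names, star_signs), ages)
rows = map_partitioned(nested, lambda pair: (pair[0][0], pair[0][1], pair[1]))
```

Because each column keeps the same partition layout, the pairing is purely positional and no shuffle or join key is needed. Columns derived from one another with a map-style operation keep that layout automatically, which is the easiest way to satisfy the requirement.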
This may seem contrived, but suppose I wanted to create a collection of
"single column" RDDs that contain calculated values, and I want to cache
them to avoid recalculating.
For example:
rdd1 = {Names}
rdd2 = {Star Sign}
rdd3 = {Age}
Then I want to create a new virtual RDD that is a collection of thes