I see the "rdd.dependencies()" function. Does it include ALL the dependencies of an RDD? Is it safe to assume I can check "rdd2.dependencies.contains(rdd1)"?
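
As far as I can tell, `dependencies` returns a Seq of `Dependency` objects for the direct parents only, and each one exposes its parent through `.rdd`, so a plain `contains(rdd1)` would probably never match. Below is a rough sketch of walking the lineage instead. It assumes both RDDs come from the same SparkContext and that the lineage has not been truncated by checkpointing. `Lineage` and `dependsOn` are names I made up for illustration, not part of the Spark API.

```scala
import org.apache.spark.rdd.RDD
import scala.annotation.tailrec

// Hypothetical helper (not part of Spark): walks the lineage graph upward.
object Lineage {

  /** Returns true if `ancestor` appears anywhere in the lineage of `descendant`. */
  def dependsOn(descendant: RDD[_], ancestor: RDD[_]): Boolean = {
    @tailrec
    def loop(pending: List[RDD[_]], seen: Set[Int]): Boolean = pending match {
      case Nil => false
      case head :: tail =>
        if (head eq ancestor) true                  // found the exact RDD instance
        else if (seen.contains(head.id)) loop(tail, seen) // already explored this RDD
        else {
          // rdd.dependencies lists only direct parents; Dependency.rdd is the parent RDD
          val parents = head.dependencies.map(_.rdd).toList
          loop(parents ::: tail, seen + head.id)
        }
    }
    // Start from the parents, so an RDD is not reported as depending on itself.
    loop(descendant.dependencies.map(_.rdd).toList, Set.empty)
  }
}
```

With something like this, `Lineage.dependsOn(rdd2, rdd1)` would be the check I am after; if it returns false, the two RDDs could presumably be run concurrently.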

On Thu, Feb 26, 2015 at 4:28 PM, Corey Nolet <cjno...@gmail.com> wrote:
> Let's say I'm given 2 RDDs and told to store them in a sequence file and
> they have the following dependency:
>
> val rdd1 = sparkContext.sequenceFile().....cache()
> val rdd2 = rdd1.map(....)....
>
> How would I tell programmatically, without being the one who built rdd1 and
> rdd2, whether or not rdd2 depends on rdd1?
>
> I'm working on a concurrency model for my application and I won't
> necessarily know how the two RDDs are constructed. What I will know is
> whether or not rdd1 is cached, but I want to maximize concurrency and run
> rdd1 and rdd2 together if rdd2 does not depend on rdd1.