I am surprised that union introduces a stage. UnionRDD should have only narrow dependencies, so by itself it should not cause a shuffle or add a stage boundary.
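
To illustrate what "only narrow dependencies" means here, below is a rough sketch in plain Python. It is a toy model, not Spark code: ToyRDD, union_lineage and read_partition are hypothetical names made up for this example. The idea is that each partition of the union reads exactly one partition of one parent, so nothing needs to be repartitioned.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ToyRDD:
    """Hypothetical stand-in for an RDD: just a list of partitions."""
    partitions: List[List[str]]


def union_lineage(parents: List[ToyRDD]) -> List[Tuple[int, int]]:
    """Map each output partition to the single (parent, partition) it reads.

    Output partitions are the parents' partitions laid end to end, which is
    the shape of a narrow (range-style) dependency: one input per output.
    """
    return [
        (parent_idx, part_idx)
        for parent_idx, rdd in enumerate(parents)
        for part_idx in range(len(rdd.partitions))
    ]


def read_partition(parents: List[ToyRDD],
                   lineage: List[Tuple[int, int]],
                   out_index: int) -> List[str]:
    """Compute one output partition by reading only its one parent partition."""
    parent_idx, part_idx = lineage[out_index]
    return parents[parent_idx].partitions[part_idx]


# Assumed sample data: rdd1 has 2 partitions, rdd2 has 3.
rdd1 = ToyRDD([["a", "b"], ["c"]])
rdd2 = ToyRDD([["d"], ["e", "f"], ["g"]])
parents = [rdd1, rdd2]
lineage = union_lineage(parents)
```

In this model the union has 2 + 3 = 5 partitions and each one is served from a single parent partition. On a real job, printing the RDD's toDebugString for rdd1.union(rdd2) should show whether a shuffle boundary actually appears in the lineage.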

On Tue, Feb 2, 2016 at 11:25 PM, Koert Kuipers <ko...@tresata.com> wrote:

> well the "hadoop" way is to save to a/b and a/c and read from a/* :)
>
> On Tue, Feb 2, 2016 at 11:05 PM, Jerry Lam <chiling...@gmail.com> wrote:
>
>> Hi Spark users and developers,
>>
>> anyone knows how to union two RDDs without the overhead of it?
>>
>> say rdd1.union(rdd2).saveTextFile(..)
>> This requires a stage to union the 2 rdds before saveAsTextFile (2
>> stages). Is there a way to skip the union step but have the contents of the
>> two rdds save to the same output text file?
>>
>> Thank you!
>>
>> Jerry
>>
>
>