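To be strictly correct, I think you have to use the foreach action to persist your data and then read it back into a new RDD. The write isn't idempotent, and a map can be re-executed for the same partition if a task fails or is retried, so the external write could happen more than once. You might get away with map as long as you can ensure that your write process is idempotent.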
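To illustrate what "idempotent" means here, below is a minimal sketch in plain Scala with no Spark dependency. Everything in it is a hypothetical stand-in: `KeyedStore` plays the role of the external resource, `upsert` plays the role of `doSomething`, and the field names on `SomeEntry` and `SomeOutput` are assumed. The idea is only that a write keyed by the entry's key leaves the store in the same state, and returns the same output, no matter how many times it is repeated.

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical stand-ins for the types in the original question.
final case class SomeEntry(key: String, payload: String)
final case class SomeOutput(key: String, prop: String)

// Hypothetical external resource: an in-memory map keyed by entry.key.
final class KeyedStore {
  private val data = TrieMap.empty[String, String]
  var writeCalls = 0 // number of write attempts (single-threaded demo only)

  // Idempotent write: the first call stores a value, repeats return the stored one.
  def upsert(entry: SomeEntry): SomeOutput = {
    writeCalls += 1
    // toUpperCase is an arbitrary stand-in for whatever the real operation returns.
    // putIfAbsent gives None on the first write and Some(existing) on a repeat.
    val stored = data
      .putIfAbsent(entry.key, entry.payload.toUpperCase)
      .getOrElse(data(entry.key))
    SomeOutput(entry.key, stored)
  }

  def size: Int = data.size
}

object IdempotentWriteDemo {
  def main(args: Array[String]): Unit = {
    val store = new KeyedStore
    val entries = Seq(SomeEntry("a", "x"), SomeEntry("b", "y"))

    val first = entries.map(store.upsert)
    val retried = entries.map(store.upsert) // simulates a re-executed task

    assert(first == retried)       // same outputs on the retry
    assert(store.size == 2)        // no duplicate rows
    assert(store.writeCalls == 4)  // even though every write ran twice
  }
}
```

If the real write behaves like `upsert`, calling it from inside `rdd.map` should tolerate a re-run. If it behaves like a blind append, the foreach-then-read-back route is the safer one.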
On Fri, Dec 19, 2014 at 10:57 AM, ashic <as...@live.com> wrote:
> Hi,
> Say we have an operation that writes something to an external resource and
> gets some output. For example:
>
> def doSomething(entry: SomeEntry, session: Session): SomeOutput = {
>   val result = session.SomeOp(entry)
>   SomeOutput(entry.Key, result.SomeProp)
> }
>
> I could use a transformation for rdd.map, but in case of failures, the map
> would run on another executor for the same rdd. I could do rdd.foreach, but
> that returns Unit. Is there something like a foreach that can return values?
>
> Thanks,
> Ashic.
>
> PS: Resending to nabble email due to spam issues.
>
> View this message in context: How to run an action and get output?