To be strictly correct, I think you may have to use the foreach action
to persist your data, since the write isn't idempotent and Spark can
re-run a task after a failure, and then read the results back into a
new RDD. You might get away with map, as long as you can ensure that
your write process is idempotent.
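
For what it's worth, here is a rough sketch of the map variant, assuming
the write can be made idempotent. Everything named here is hypothetical
and not from your code: the Session trait with an upsert method keyed on
the entry key, the lowercase field names, and the openSession factory
(which must be serializable so it can be shipped to executors). It uses
mapPartitions rather than map so that one session is opened per
partition.

```scala
import org.apache.spark.rdd.RDD

object IdempotentWrite {
  // Hypothetical stand-ins for the types in the question.
  case class SomeEntry(key: String, payload: String)
  case class SomeOutput(key: String, prop: String)

  // Hypothetical external client. The assumption is that upsert is
  // idempotent: calling it again with the same key and payload leaves
  // the external resource in the same state and returns the same value.
  trait Session extends AutoCloseable {
    def upsert(key: String, payload: String): String
  }

  // Writes every entry and returns the outputs as a new RDD.
  // openSession is called on the executor, once per partition.
  def writeAndCollect(entries: RDD[SomeEntry],
                      openSession: () => Session): RDD[SomeOutput] =
    entries.mapPartitions { iter =>
      val session = openSession()
      try {
        // toList forces all writes to happen before the session is closed.
        iter.map(e => SomeOutput(e.key, session.upsert(e.key, e.payload)))
          .toList
          .iterator
      } finally {
        session.close()
      }
    }
}
```

Note that the returned RDD is lazy, so nothing is written until an
action runs on it, and the writes run again whenever a partition is
recomputed. Caching the result reduces recomputation but does not rule
it out, which is why the idempotency assumption matters.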

On Fri, Dec 19, 2014 at 10:57 AM, ashic <as...@live.com> wrote:
> Hi,
> Say we have an operation that writes something to an external resource and
> gets some output. For example:
>
> def doSomething(entry: SomeEntry, session: Session): SomeOutput = {
>     val result = session.SomeOp(entry)
>     SomeOutput(entry.Key, result.SomeProp)
> }
>
> I could use a transformation for rdd.map, but in case of failures, the map
> would run on another executor for the same rdd. I could do rdd.foreach, but
> that returns unit. Is there something like a foreach that can return values?
>
> Thanks,
> Ashic.
>
> PS: Resending to nabble email due to spam issues.
>
> ________________________________
> View this message in context: How to run an action and get output?
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
