Is there a way to have the exception handling go lazily along with the definition?
e.g... we define it on the RDD but then our exception handling code gets triggered on that first action... without us having to define it on the first action? (e.g. that RDD code is boilerplate and we want to just have it in many many projects) m On Wed, Mar 11, 2015 at 10:08 AM, Sean Owen <so...@cloudera.com> wrote: > Handling exceptions this way means handling errors on the driver side, > which may or may not be what you want. You can also write functions > with exception handling inside, which could make more sense in some > cases (like, to ignore bad records or count them or something). > > If you want to handle errors at every step on the driver side, you > have to force RDDs to materialize to see if they "work". You can do > that with .count() or .take(1).length > 0. But to avoid recomputing > the RDD then, it needs to be cached. So there is a big non-trivial > overhead to approaching it this way. > > If you go this way, consider materializing only a few key RDDs in your > flow, not every one. > > The most natural thing is indeed to handle exceptions where the action > occurs. > > > On Wed, Mar 11, 2015 at 1:51 PM, Michal Klos <michal.klo...@gmail.com> > wrote: > > Hi Spark Community, > > > > We would like to define exception handling behavior on RDD instantiation > / > > build. Since the RDD is lazily evaluated, it seems like we are forced to > put > > all exception handling in the first action call? > > > > This is an example of something that would be nice: > > > > def myRDD = { > > Try { > > val rdd = sc.textFile(...) > > } match { > > Failure(e) => Handle ... > > } > > } > > > > myRDD.reduceByKey(...) //don't need to worry about that exception here > > > > The reason being that we want to try to avoid having to copy paste > exception > > handling boilerplate on every first action. We would love to define this > > once somewhere for the RDD build code and just re-use. > > > > Is there a best practice for this? Are we missing something here? > > > > thanks, > > Michal >