Did you actually observe a perf issue?

On Mon, Jan 21, 2019 at 10:04 AM Sean Owen <sro...@gmail.com> wrote:
> The ClosureCleaner proactively checks that closures passed to
> transformations like RDD.map() are serializable, before they're
> executed. It does this by just serializing it with the JavaSerializer.
>
> That's a nice feature, although there's overhead in always trying to
> serialize the closure ahead of time, especially if the closure is
> large. It shouldn't be large, usually. But I noticed it when coming up
> with this fix: https://github.com/apache/spark/pull/23600
>
> It made me wonder, should this be optional, or even not the default?
> Closures that don't serialize still fail, just later when an action is
> invoked. I don't feel strongly about it, just checking if anyone had
> pondered this before.
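
For anyone following the thread, the kind of eager check described in the quoted message can be sketched in plain Java. This is only an illustration of "serialize the closure up front and see whether it fails", not Spark's actual ClosureCleaner code: the class name EagerClosureCheck, the method checkSerializable, and the SerFn interface are all hypothetical, and it uses the JDK's ObjectOutputStream directly rather than Spark's JavaSerializer. It also returns the serialized size, which shows why the cost grows with whatever the closure captures.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

public class EagerClosureCheck {

    // Hypothetical helper type: a Function whose lambdas are serializable.
    public interface SerFn<A, B> extends Function<A, B>, Serializable {}

    // Serializes the closure immediately and returns the number of bytes
    // written. Throws if the closure (or anything it captures) cannot be
    // serialized, so the failure surfaces before any work is scheduled.
    public static int checkSerializable(Object closure) {
        try (ByteArrayOutputStream buf = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(closure);
            out.flush();
            return buf.size();
        } catch (IOException e) {
            // NotSerializableException is a subclass of IOException.
            throw new IllegalArgumentException("closure is not serializable", e);
        }
    }

    public static void main(String[] args) {
        int[] big = new int[1_000_000];  // about 4 MB of captured state
        Object lock = new Object();      // java.lang.Object is not Serializable

        SerFn<Integer, Integer> small = x -> x + 1;
        SerFn<Integer, Integer> large = x -> x + big.length;
        SerFn<Integer, Integer> bad = x -> x + lock.hashCode();

        System.out.println("small closure bytes: " + checkSerializable(small));
        // The whole captured array is written out just to validate the closure.
        System.out.println("large closure bytes: " + checkSerializable(large));

        try {
            checkSerializable(bad);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected eagerly: " + e.getCause());
        }
    }
}
```

The "large" case is the overhead in question: the check pays for a full serialization of the captured state, and the same closure is serialized again when the work is actually shipped. Skipping the check would not let the "bad" closure through; it would only move the failure to the point where an action runs.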