Github user willb commented on the pull request:

    https://github.com/apache/spark/pull/143#issuecomment-37686459

    Yes, my understanding of SPARK-897 is that the issue is ensuring that serializability errors are reported to the user as soon as possible. Essentially, these commits replicate the closure-serializability check (which, as you note, currently occurs in the scheduler as part of job submission) in `ClosureCleaner.clean`, which is called in the driver for every closure argument to an RDD transformation method. (The test cases I added in f2ef54e check that unserializable-closure failures happen immediately when a transformation is invoked, not only once an action runs.)
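
To illustrate the idea of failing at transformation time rather than at action time, here is a minimal sketch in plain Java. It is not Spark's `ClosureCleaner` code: `EagerClosureCheck`, `SerFunction`, `Unserializable`, `ensureSerializable`, and `lazyMap` are hypothetical names invented for this example, and the sketch assumes the check amounts to a trial Java serialization of the closure whose bytes are then discarded.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Illustrative only: every name in this class is hypothetical, not part of Spark.
final class EagerClosureCheck {
    /** Serializable function type, standing in for a closure passed to a transformation. */
    interface SerFunction<A, B> extends Function<A, B>, Serializable {}

    /** Stand-in for a driver-side object that cannot be serialized. */
    static final class Unserializable {
        final int offset = 1;
    }

    /** Trial-serialize the closure and discard the bytes; fail fast if it cannot be written. */
    static <T> T ensureSerializable(T closure) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(closure);
        } catch (IOException e) {
            // NotSerializableException is an IOException, so it is reported here.
            throw new IllegalArgumentException("Closure not serializable: " + e, e);
        }
        return closure;
    }

    /** Lazy "transformation": checks the closure now, applies it only when get() is called. */
    static <A, B> Supplier<List<B>> lazyMap(List<A> data, SerFunction<A, B> f) {
        SerFunction<A, B> checked = ensureSerializable(f);
        return () -> data.stream().map(checked).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // A closure that captures nothing passes the check; work happens only at get().
        Supplier<List<Integer>> doubled = lazyMap(Arrays.asList(1, 2, 3), x -> x * 2);
        System.out.println(doubled.get());

        // A closure that captures an unserializable object is rejected immediately,
        // before any "action" (get()) is requested.
        Unserializable u = new Unserializable();
        try {
            lazyMap(Arrays.asList(1, 2, 3), x -> x + u.offset);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected at transformation time");
        }
    }
}
```

In this sketch the second `lazyMap` call throws before `get()` is ever called, which mirrors the behavior the test cases are described as checking. The trade-off is that each closure is serialized once more than strictly necessary.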