I can suggest two things:

1. While creating the worker or submitting the task, make sure you are not capturing any unwanted reference to an external class (one that is not actually used in the closure and is not serializable). A small sketch of this is below.

2. If that is ensured and you still get the exception from a 3rd-party library, you can mark that 3rd-party reference as transient in your code and define a private readObject(ObjectInputStream) method (an instance method, not static) to re-initialize that particular variable after deserialization.
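For the first point, here is a rough sketch of the kind of accidental capture to watch for (class and field names are just made up for illustration, not from your code):

import org.apache.spark.rdd.RDD

// Stands in for any external / 3rd-party resource that is NOT serializable.
class ExternalResource {
  def lookup(x: Int): Int = x
}

class MyJob {
  val resource = new ExternalResource
  val factor = 10

  // Risky: referencing the field `factor` inside the closure captures `this`,
  // so the whole MyJob instance (including `resource`) must be serialized
  // -> Task not serializable.
  def scaleBad(rdd: RDD[Int]): RDD[Int] = rdd.map(x => x * factor)

  // Safer: copy the value into a local val first; the closure then captures
  // only that Int, not the enclosing class.
  def scaleGood(rdd: RDD[Int]): RDD[Int] = {
    val localFactor = factor
    rdd.map(x => x * localFactor)
  }
}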
For the second point, e.g.:

import java.io.ObjectInputStream

class MyClass extends Serializable {
  // 3rd-party reference which is not serializable, so mark it transient.
  // SomeThirdPartyClass is a placeholder for the actual 3rd-party type.
  @transient private var ref: SomeThirdPartyClass = initRef()

  private def initRef(): SomeThirdPartyClass = {
    // ... create and return the 3rd-party object here ...
  }

  // Must be a private instance method named readObject; Java serialization
  // finds it by reflection when the object is deserialized on the worker.
  private def readObject(in: ObjectInputStream): Unit = {
    in.defaultReadObject() // default Java deserialization for all other fields
    ref = initRef()        // re-create the transient 3rd-party reference
  }
}

Thanks,
Sourav

On Mon, Mar 24, 2014 at 3:06 PM, santhoma <santhosh.tho...@yahoo.com> wrote:

> I am also facing the same problem. I have implemented Serializable for my
> code, but the exception is thrown from third-party libraries over which I
> have no control.
>
> Exception in thread "main" org.apache.spark.SparkException: Job aborted:
> Task not serializable: java.io.NotSerializableException: (lib class name here)
>         at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1028)
>
> Is it mandatory that Serializable must be implemented for dependent jars
> as well?
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Java-API-Serialization-Issue-tp1460p3086.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.

--
Sourav Chandra
Senior Software Engineer
· · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · · ·
sourav.chan...@livestream.com
o: +91 80 4121 8723
m: +91 988 699 3746
skype: sourav.chandra
Livestream
"Ajmera Summit", First Floor, #3/D, 68 Ward, 3rd Cross, 7th C Main, 3rd Block,
Koramangala Industrial Area, Bangalore 560034
www.livestream.com