Thank you Ankur for the quick response, really appreciate it! Making the class serializable resolved the exception!
Best regards,
Mina

On Mon, Mar 6, 2017 at 4:20 PM, Ankur Srivastava <ankur.srivast...@gmail.com> wrote:

> The fix for this is to make your class Serializable. The reason is that the
> closures you have defined in the class need to be serialized and copied
> over to all executor nodes.
>
> Hope this helps.
>
> Thanks
> Ankur
>
> On Mon, Mar 6, 2017 at 1:06 PM, Mina Aslani <aslanim...@gmail.com> wrote:
>
>> Hi,
>>
>> I am trying to get started with Spark and count the number of lines of a
>> text file on my Mac, however I get
>>
>> org.apache.spark.SparkException: Task not serializable
>>
>> on
>>
>> JavaRDD<String> logData = javaCtx.textFile(file);
>>
>> Please see below for a sample of the code and the stack trace.
>>
>> Any idea why this error is thrown?
>>
>> Best regards,
>>
>> Mina
>>
>> System.out.println("Creating Spark Configuration");
>> SparkConf javaConf = new SparkConf();
>> javaConf.setAppName("My First Spark Java Application");
>> javaConf.setMaster("PATH to my spark");
>> System.out.println("Creating Spark Context");
>> JavaSparkContext javaCtx = new JavaSparkContext(javaConf);
>> System.out.println("Loading the Dataset and will further process it");
>> String file = "file:///file.txt";
>> JavaRDD<String> logData = javaCtx.textFile(file);
>>
>> long numLines = logData.filter(new Function<String, Boolean>() {
>>     public Boolean call(String s) {
>>         return true;
>>     }
>> }).count();
>>
>> System.out.println("Number of Lines in the Dataset " + numLines);
>>
>> javaCtx.close();
>>
>> Exception in thread "main" org.apache.spark.SparkException: Task not serializable
>>     at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
>>     at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
>>     at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
>>     at org.apache.spark.SparkContext.clean(SparkContext.scala:2094)
>>     at org.apache.spark.rdd.RDD$$anonfun$filter$1.apply(RDD.scala:387)
>>     at org.apache.spark.rdd.RDD$$anonfun$filter$1.apply(RDD.scala:386)
>>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>>     at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
>>     at org.apache.spark.rdd.RDD.filter(RDD.scala:386)
>>     at org.apache.spark.api.java.JavaRDD.filter(JavaRDD.scala:78)
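
For anyone hitting the same error: the anonymous Function in the code above is an inner class, so it carries a hidden reference to the enclosing class, and Spark must serialize that enclosing instance when it ships the closure to the executors. Below is a minimal sketch of the fix along the lines Ankur describes; the class name (LineCounter), the local[*] master, and the file path are assumptions for illustration, not from the original thread. Either mark the class Serializable, or use a Java 8 lambda (or a static nested class) so the enclosing instance is never captured.

import java.io.Serializable;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical class name; the important part is "implements Serializable".
public class LineCounter implements Serializable {

    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("My First Spark Java Application")
                // setMaster expects a master URL such as local[*] or spark://host:7077,
                // not a filesystem path.
                .setMaster("local[*]");

        JavaSparkContext sc = new JavaSparkContext(conf);

        // Assumed example path; replace with the actual input file.
        JavaRDD<String> logData = sc.textFile("file:///file.txt");

        // A lambda does not capture the enclosing instance, so this filter
        // serializes cleanly even without the Serializable marker above.
        long numLines = logData.filter(s -> true).count();

        System.out.println("Number of Lines in the Dataset " + numLines);

        sc.close();
    }
}

Making the outer class Serializable, as Mina did, works as well; the lambda form just avoids dragging the whole enclosing object over the wire to the executors.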