You’ll definitely want to use a Kryo-based serializer for Avro. We have a Kryo-based serializer that wraps Avro’s efficient serializer here.
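As a rough sketch of what such a wrapper can look like (the AvroKryoSerializer class name below is illustrative and is not the serializer linked above; it assumes Kryo 2.x and Avro's specific-record API):

import com.esotericsoftware.kryo.{Kryo, Serializer}
import com.esotericsoftware.kryo.io.{Input, Output}
import org.apache.avro.io.{DecoderFactory, EncoderFactory}
import org.apache.avro.specific.{SpecificDatumReader, SpecificDatumWriter, SpecificRecord}

// Kryo Serializer that delegates to Avro's binary encoding for a generated
// SpecificRecord class, so Spark never falls back to Java serialization.
class AvroKryoSerializer[T <: SpecificRecord](clazz: Class[T]) extends Serializer[T] {
  private val writer = new SpecificDatumWriter[T](clazz)
  private val reader = new SpecificDatumReader[T](clazz)

  override def write(kryo: Kryo, output: Output, record: T): Unit = {
    // Encode the record with Avro's binary encoder and write it length-prefixed.
    val bytes = new java.io.ByteArrayOutputStream()
    val encoder = EncoderFactory.get().binaryEncoder(bytes, null)
    writer.write(record, encoder)
    encoder.flush()
    val payload = bytes.toByteArray
    output.writeInt(payload.length, true)
    output.writeBytes(payload)
  }

  override def read(kryo: Kryo, input: Input, clazz: Class[T]): T = {
    // Read the length-prefixed payload back and decode it with Avro.
    val payload = input.readBytes(input.readInt(true))
    val decoder = DecoderFactory.get().binaryDecoder(payload, null)
    reader.read(null.asInstanceOf[T], decoder)
  }
}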
Frank Austin Nothaft
fnoth...@berkeley.edu
fnoth...@eecs.berkeley.edu
202-340-0466

On Apr 3, 2015, at 5:41 AM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> Because it's throwing up serialization exceptions, and Kryo is a serializer to
> serialize your objects.
>
> Thanks
> Best Regards
>
> On Fri, Apr 3, 2015 at 5:37 PM, Deepak Jain <deepuj...@gmail.com> wrote:
> I meant that I did not have to use Kryo. Why will Kryo help fix this issue
> now?
>
> Sent from my iPhone
>
> On 03-Apr-2015, at 5:36 pm, Deepak Jain <deepuj...@gmail.com> wrote:
>
>> I was able to write a record that extends SpecificRecord (Avro); that class was
>> not auto-generated. Do we need to do something extra for auto-generated
>> classes?
>>
>> Sent from my iPhone
>>
>> On 03-Apr-2015, at 5:06 pm, Akhil Das <ak...@sigmoidanalytics.com> wrote:
>>
>>> This thread might give you some insights:
>>> http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201311.mbox/%3CCA+WVT8WXbEHac=N0GWxj-s9gqOkgG0VRL5B=ovjwexqm8ev...@mail.gmail.com%3E
>>>
>>> Thanks
>>> Best Regards
>>>
>>> On Fri, Apr 3, 2015 at 3:53 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>> My Spark job failed with:
>>>
>>> 15/04/03 03:15:36 INFO scheduler.DAGScheduler: Job 0 failed: saveAsNewAPIHadoopFile at AbstractInputHelper.scala:103, took 2.480175 s
>>> 15/04/03 03:15:36 ERROR yarn.ApplicationMaster: User class threw exception: Job aborted due to stage failure: Task 0.0 in stage 2.0 (TID 0) had a not serializable result: com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
>>> Serialization stack:
>>>     - object not serializable (class: com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value: {"userId": 0, "spsPrgrmId": 0, "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": null, "spsSlrLevelSumEndDt": null, "currPsLvlId": null})
>>>     - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
>>>     - object (class scala.Tuple2, (0,{"userId": 0, "spsPrgrmId": 0, "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": null, "spsSlrLevelSumEndDt": null, "currPsLvlId": null}))
>>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0 in stage 2.0 (TID 0) had a not serializable result: com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
>>> Serialization stack:
>>>     - object not serializable (class: com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value: {"userId": 0, "spsPrgrmId": 0, "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": null, "spsSlrLevelSumEndDt": null, "currPsLvlId": null})
>>>     - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
>>>     - object (class scala.Tuple2, (0,{"userId": 0, "spsPrgrmId": 0, "spsSlrLevelCd": 0, "spsSlrLevelSumStartDt": null, "spsSlrLevelSumEndDt": null, "currPsLvlId": null}))
>>>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
>>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
>>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
>>>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>>     at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
>>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
>>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
>>>     at scala.Option.foreach(Option.scala:236)
>>>     at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
>>>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
>>>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
>>>     at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>>> 15/04/03 03:15:36 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: Job aborted due to stage failure: Task 0.0 in stage 2.0 (TID 0) had a not serializable result: com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
>>> Serialization stack:
>>>
>>> ....
>>>
>>> com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum is auto-generated
>>> from an Avro schema using the avro-generate-sources Maven plugin.
>>>
>>> package com.ebay.ep.poc.spark.reporting.process.model.dw;
>>>
>>> @SuppressWarnings("all")
>>> @org.apache.avro.specific.AvroGenerated
>>> public class SpsLevelMetricSum extends org.apache.avro.specific.SpecificRecordBase
>>>     implements org.apache.avro.specific.SpecificRecord {
>>> ...
>>> ...
>>> }
>>>
>>> Can anyone suggest how to fix this?
>>>
>>> --
>>> Deepak
>>>
>
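For reference, a minimal sketch of what wiring this up in the job could look like, assuming Spark 1.2+ and a wrapper like the AvroKryoSerializer sketched earlier in this thread (the AvroRegistrator and AvroKryoSetup names are illustrative):

import com.esotericsoftware.kryo.Kryo
import org.apache.spark.SparkConf
import org.apache.spark.serializer.KryoRegistrator
import com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum

// Registrator that tells Kryo to use the Avro-backed serializer for the
// generated class instead of reflective field serialization.
class AvroRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    kryo.register(classOf[SpsLevelMetricSum],
      new AvroKryoSerializer(classOf[SpsLevelMetricSum]))
  }
}

object AvroKryoSetup {
  // Build a SparkConf that serializes tasks' results with Kryo and routes the
  // Avro-generated class through the registrator above.
  def conf(): SparkConf =
    new SparkConf()
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryo.registrator", classOf[AvroRegistrator].getName)
}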