Seems like there is some sort of stream corruption, causing Kryo to read a weird class name from the stream (the name "arl Fridtjof Rode" in the exception cannot be a class!). Not sure how to debug this.

@Patrick: Any idea?
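For context on why corruption surfaces as a bogus class name: when a class is not registered with Kryo, its fully qualified name is written into the stream as a plain string (the DefaultClassResolver.readName path visible in the traces below), so a single shifted or flipped byte can turn record data into a "class name". Below is a minimal sketch of that failure mode, assuming Kryo 2.x on the classpath; the Person class and the corrupted byte offset are illustrative, not from the thread.

    import com.esotericsoftware.kryo.Kryo
    import com.esotericsoftware.kryo.io.{Input, Output}

    // Hypothetical unregistered class: Kryo embeds its full name in the stream.
    class Person { var name: String = "Karl Fridtjof Rode" }

    object CorruptionDemo {
      def main(args: Array[String]): Unit = {
        val kryo = new Kryo() // registration is optional by default
        val out = new Output(4096)
        kryo.writeClassAndObject(out, new Person)
        val bytes = out.toBytes

        // Simulate single-byte corruption near the start of the stream,
        // where the class name lives (the exact offset is illustrative).
        bytes(3) = 'X'.toByte

        // Typically fails with KryoException: Unable to find class: ...
        kryo.readClassAndObject(new Input(bytes))
      }
    }

Note how the thread's "arl Fridtjof Rode" looks like record data (a name) with its first byte consumed, which fits this picture.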
On Wed, Jul 16, 2014 at 10:14 PM, Hao Wang <wh.s...@gmail.com> wrote:

> I am not sure. Not every task fails with this Kryo exception; most of the
> time the cluster finishes the WikipediaPageRank successfully.
> How could I debug this exception?
>
> Thanks
>
> Regards,
> Wang Hao(王灏)
>
> CloudTeam | School of Software Engineering
> Shanghai Jiao Tong University
> Address: 800 Dongchuan Road, Minhang District, Shanghai, 200240
> Email: wh.s...@gmail.com
>
>
> On Thu, Jul 17, 2014 at 2:58 AM, Tathagata Das <tathagata.das1...@gmail.com> wrote:
>
>> Is the class that is not found in the wikipediapagerank jar?
>>
>> TD
>>
>>
>> On Wed, Jul 16, 2014 at 12:32 AM, Hao Wang <wh.s...@gmail.com> wrote:
>>
>>> Thanks for your reply. The SparkContext is configured as below:
>>>
>>>     sparkConf.setAppName("WikipediaPageRank")
>>>     sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
>>>     sparkConf.set("spark.kryo.registrator", classOf[PRKryoRegistrator].getName)
>>>
>>>     val inputFile = args(0)
>>>     val threshold = args(1).toDouble
>>>     val numPartitions = args(2).toInt
>>>     val usePartitioner = args(3).toBoolean
>>>
>>>     sparkConf.setAppName("WikipediaPageRank")
>>>     sparkConf.set("spark.executor.memory", "60g")
>>>     sparkConf.set("spark.cores.max", "48")
>>>     sparkConf.set("spark.kryoserializer.buffer.mb", "24")
>>>
>>>     val sc = new SparkContext(sparkConf)
>>>     sc.addJar("~/Documents/Scala/WikiPageRank/target/scala-2.10/wikipagerank_2.10-1.0.jar")
>>>
>>> And I use spark-submit to run the application:
>>>
>>>     ./bin/spark-submit --master spark://sing12:7077 \
>>>       --total-executor-cores 40 --executor-memory 40g \
>>>       --class org.apache.spark.examples.bagel.WikipediaPageRank \
>>>       ~/Documents/Scala/WikiPageRank/target/scala-2.10/wikipagerank_2.10-1.0.jar \
>>>       hdfs://192.168.1.12:9000/freebase-26G 1 200 True
>>>
>>> Regards,
>>> Wang Hao(王灏)
>>>
>>> CloudTeam | School of Software Engineering
>>> Shanghai Jiao Tong University
>>> Address: 800 Dongchuan Road, Minhang District, Shanghai, 200240
>>> Email: wh.s...@gmail.com
>>>
>>>
>>> On Wed, Jul 16, 2014 at 1:41 PM, Tathagata Das <tathagata.das1...@gmail.com> wrote:
>>>
>>>> Are you using classes from external libraries that have not been added
>>>> to the sparkContext, using sparkcontext.addJar()?
>>>>
>>>> TD
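The PRKryoRegistrator referenced in the configuration above is not shown in the thread. As a sketch of what spark.kryo.registrator expects, modeled on the Bagel PageRank example's vertex and message types (the classes the real PRKryoRegistrator registers may differ):

    import com.esotericsoftware.kryo.Kryo
    import org.apache.spark.serializer.KryoRegistrator

    // Sketch only: PRVertex and PRMessage stand in for whatever record
    // types the job actually serializes.
    import org.apache.spark.examples.bagel.{PRMessage, PRVertex}

    class PRKryoRegistrator extends KryoRegistrator {
      override def registerClasses(kryo: Kryo): Unit = {
        kryo.register(classOf[PRVertex])
        kryo.register(classOf[PRMessage])
      }
    }

Registering the record classes also means Kryo writes small integer IDs instead of class-name strings, which shrinks the stream and removes the name-lookup failure mode sketched earlier.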
>>>> On Tue, Jul 15, 2014 at 8:36 PM, Hao Wang <wh.s...@gmail.com> wrote:
>>>>
>>>>> I am running the WikipediaPageRank example in Spark and hit the same
>>>>> problem as you:
>>>>>
>>>>> 14/07/16 11:31:06 DEBUG DAGScheduler: submitStage(Stage 6)
>>>>> 14/07/16 11:31:06 ERROR TaskSetManager: Task 6.0:450 failed 4 times; aborting job
>>>>> 14/07/16 11:31:06 INFO DAGScheduler: Failed to run foreach at Bagel.scala:251
>>>>> Exception in thread "main" 14/07/16 11:31:06 INFO TaskSchedulerImpl: Cancelling stage 6
>>>>> org.apache.spark.SparkException: Job aborted due to stage failure:
>>>>> Task 6.0:450 failed 4 times, most recent failure: Exception failure in
>>>>> TID 1330 on host sing11: com.esotericsoftware.kryo.KryoException:
>>>>> Unable to find class: arl Fridtjof Rode
>>>>>     com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
>>>>>     com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
>>>>>     com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:610)
>>>>>     com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:721)
>>>>>     com.twitter.chill.TraversableSerializer.read(Traversable.scala:44)
>>>>>     com.twitter.chill.TraversableSerializer.read(Traversable.scala:21)
>>>>>     com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
>>>>>     org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:115)
>>>>>     org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:125)
>>>>>     org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
>>>>>     org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>>>>>     scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>>>>>     org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:58)
>>>>>     org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:96)
>>>>>     org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:95)
>>>>>     org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:582)
>>>>>
>>>>> Could anyone help?
>>>>>
>>>>> Regards,
>>>>> Wang Hao(王灏)
>>>>>
>>>>> CloudTeam | School of Software Engineering
>>>>> Shanghai Jiao Tong University
>>>>> Address: 800 Dongchuan Road, Minhang District, Shanghai, 200240
>>>>> Email: wh.s...@gmail.com
>>>>>
>>>>>
>>>>> On Tue, Jun 3, 2014 at 8:02 PM, Denes <te...@outlook.com> wrote:
>>>>>
>>>>>> I tried to use Kryo as a serialiser in Spark Streaming and did
>>>>>> everything according to the guide posted on the Spark website,
>>>>>> i.e. added the following lines:
>>>>>>
>>>>>>     conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
>>>>>>     conf.set("spark.kryo.registrator", "MyKryoRegistrator");
>>>>>>
>>>>>> I also added the necessary classes to MyKryoRegistrator.
>>>>>>
>>>>>> However, I get the following strange error. Can someone help me out
>>>>>> with where to look for a solution?
>>>>>> >>>>>> 14/06/03 09:00:49 ERROR scheduler.JobScheduler: Error running job >>>>>> streaming >>>>>> job 1401778800000 ms.0 >>>>>> org.apache.spark.SparkException: Job aborted due to stage failure: >>>>>> Exception >>>>>> while deserializing and fetching task: >>>>>> com.esotericsoftware.kryo.KryoException: Unable to find class: J >>>>>> Serialization trace: >>>>>> id (org.apache.spark.storage.GetBlock) >>>>>> at >>>>>> org.apache.spark.scheduler.DAGScheduler.org >>>>>> $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033) >>>>>> at >>>>>> >>>>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017) >>>>>> at >>>>>> >>>>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015) >>>>>> at >>>>>> >>>>>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) >>>>>> at >>>>>> scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) >>>>>> at >>>>>> >>>>>> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015) >>>>>> at >>>>>> >>>>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633) >>>>>> at >>>>>> >>>>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633) >>>>>> at scala.Option.foreach(Option.scala:236) >>>>>> at >>>>>> >>>>>> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633) >>>>>> at >>>>>> >>>>>> org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207) >>>>>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) >>>>>> at akka.actor.ActorCell.invoke(ActorCell.scala:456) >>>>>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) >>>>>> at akka.dispatch.Mailbox.run(Mailbox.scala:219) >>>>>> at >>>>>> >>>>>> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) >>>>>> at >>>>>> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) >>>>>> at >>>>>> >>>>>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) >>>>>> at >>>>>> >>>>>> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) >>>>>> at >>>>>> >>>>>> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> View this message in context: >>>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Kyro-deserialisation-error-tp6798.html >>>>>> Sent from the Apache Spark User List mailing list archive at >>>>>> Nabble.com. >>>>>> >>>>> >>>>> >>>> >>> >> >