On Wed, July 2, 2014 1:11 am, Mayur Rustagi wrote:
> Ideally you should be converting the RDD to a SchemaRDD?
> Are you creating a UnionRDD to join across DStream RDDs?
>
> Mayur Rustagi
> Ph: +1 (760) 203 3257
> http://www.sigmoidanalytics.com
> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>
> On Tue, Jul 1, 2014 at 3:11 PM, Honey Joshi
> <honeyjo...@ideata-analytics.com> wrote:
>>
>> Hi,
>> I am trying to run a project which takes data as a DStream and dumps the
>> data into a Shark table after various operations. I am getting the
>> following error:
>>
>> Exception in thread "main" org.apache.spark.SparkException: Job aborted:
>> Task 0.0:0 failed 1 times (most recent failure: Exception failure:
>> java.lang.ClassCastException: org.apache.spark.rdd.UnionPartition cannot
>> be cast to org.apache.spark.rdd.HadoopPartition)
>>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1028)
>>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1026)
>>   at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>   at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1026)
>>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:619)
>>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:619)
>>   at scala.Option.foreach(Option.scala:236)
>>   at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:619)
>>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:207)
>>   at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>>   at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>>   at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>>   at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>>   at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>   at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>   at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>
>> Can someone please explain the cause of this error? I am also using a
>> SparkContext alongside the existing StreamingContext.
I am using Spark 0.9.0-incubating, so this doesn't have anything to do with SchemaRDD. The error probably arises because I am trying to use one SparkContext and one SharkContext in the same job. Is there any way to incorporate two contexts in one job?

Regards
Honey Joshi
Ideata-Analytics
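For reference, one way to avoid the two-context setup entirely is to share a single context: in Shark 0.9 a SharkContext extends SparkContext, and a StreamingContext can be constructed on top of an existing SparkContext. Below is a minimal sketch of that arrangement; it assumes Shark 0.9's SharkEnv.initWithSharkContext helper (check the signature in your Shark build), and "StreamingToShark" and the batch interval are hypothetical.

import org.apache.spark.streaming.{Seconds, StreamingContext}
import shark.{SharkContext, SharkEnv}

// Assumption: SharkEnv.initWithSharkContext returns a SharkContext,
// which is itself a SparkContext. "StreamingToShark" is a hypothetical
// application name.
val sc: SharkContext = SharkEnv.initWithSharkContext("StreamingToShark")

// Build the StreamingContext on the SAME underlying SparkContext
// instead of creating a second, independent context in the driver.
val ssc = new StreamingContext(sc, Seconds(10))

// Shark SQL statements (e.g. inserts into a Shark table) then go through
// sc, while DStream operations go through ssc; both share one context.

Whether this removes the UnionPartition/HadoopPartition cast depends on where the UnionRDD in the failing stage comes from, but it at least eliminates the separate SparkContext and SharkContext suspected above.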