Two job contexts cannot share data. Are you collecting the data to the master and then sending it to the other context?
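
Roughly what I have in mind, as a sketch only (writeToSharkTable is a hypothetical placeholder for however you load rows into your Shark table, not an actual Shark API, and collecting to the driver is only sensible for small per-batch volumes):

    import org.apache.spark.SparkContext
    import org.apache.spark.rdd.RDD
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object CollectAndHandOff {
      // Placeholder only: swap in whatever you actually use to load an RDD into your Shark table.
      def writeToSharkTable(rows: RDD[String]): Unit = ???

      def main(args: Array[String]): Unit = {
        // One context drives the stream; a separate context owns the Shark side.
        val ssc = new StreamingContext("local[2]", "streamingSide", Seconds(10))
        val sharkSideSc = new SparkContext("local[2]", "sharkSide")

        val lines = ssc.socketTextStream("localhost", 9999)

        lines.foreachRDD { rdd =>
          // Pull the batch back to the driver -- only reasonable for small batches.
          val rows = rdd.collect()
          // Rebuild the data under the other context before handing it to Shark,
          // so no single RDD ever mixes partitions from two different contexts.
          writeToSharkTable(sharkSideSc.parallelize(rows))
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }

That way nothing computed under one context is ever passed directly into a job scheduled by the other.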
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>


On Wed, Jul 2, 2014 at 11:57 AM, Honey Joshi <honeyjo...@ideata-analytics.com> wrote:

> On Wed, July 2, 2014 1:11 am, Mayur Rustagi wrote:
> > Ideally you should be converting the RDD to a SchemaRDD?
> > Are you creating a UnionRDD to join across DStream RDDs?
> >
> > Mayur Rustagi
> > Ph: +1 (760) 203 3257
> > http://www.sigmoidanalytics.com
> > @mayur_rustagi <https://twitter.com/mayur_rustagi>
> >
> > On Tue, Jul 1, 2014 at 3:11 PM, Honey Joshi <honeyjo...@ideata-analytics.com> wrote:
> >>
> >> Hi,
> >> I am trying to run a project which takes data as a DStream and dumps the
> >> data into a Shark table after various operations. I am getting the
> >> following error:
> >>
> >> Exception in thread "main" org.apache.spark.SparkException: Job aborted:
> >> Task 0.0:0 failed 1 times (most recent failure: Exception failure:
> >> java.lang.ClassCastException: org.apache.spark.rdd.UnionPartition cannot
> >> be cast to org.apache.spark.rdd.HadoopPartition)
> >>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1028)
> >>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1026)
> >>   at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> >>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> >>   at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1026)
> >>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:619)
> >>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:619)
> >>   at scala.Option.foreach(Option.scala:236)
> >>   at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:619)
> >>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:207)
> >>   at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
> >>   at akka.actor.ActorCell.invoke(ActorCell.scala:456)
> >>   at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
> >>   at akka.dispatch.Mailbox.run(Mailbox.scala:219)
> >>   at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
> >>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> >>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> >>   at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> >>   at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> >>
> >> Can someone please explain the cause of this error? I am also using a
> >> SparkContext alongside the existing StreamingContext.
>
> I am using Spark 0.9.0-incubating, so it doesn't have anything to do with
> SchemaRDD. This error is probably coming up because I am trying to use one
> Spark context and one Shark context in the same job. Is there any way to
> incorporate two contexts in one job?
>
> Regards
>
> Honey Joshi
> Ideata-Analytics