Two job contexts cannot share data. Are you collecting the data to the master and then sending it to the other context?
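
Roughly what I have in mind, as a sketch only (writeToSharkTable is a hypothetical placeholder for however you load rows into your Shark table, not an actual Shark API, and collecting to the driver is only sensible for small per-batch volumes):

    import org.apache.spark.SparkContext
    import org.apache.spark.rdd.RDD
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object CollectAndHandOff {
      // Placeholder only: swap in whatever you actually use to load an RDD into your Shark table.
      def writeToSharkTable(rows: RDD[String]): Unit = ???

      def main(args: Array[String]): Unit = {
        // One context drives the stream; a separate context owns the Shark side.
        val ssc = new StreamingContext("local[2]", "streamingSide", Seconds(10))
        val sharkSideSc = new SparkContext("local[2]", "sharkSide")

        val lines = ssc.socketTextStream("localhost", 9999)

        lines.foreachRDD { rdd =>
          // Pull the batch back to the driver -- only reasonable for small batches.
          val rows = rdd.collect()
          // Rebuild the data under the other context before handing it to Shark,
          // so no single RDD ever mixes partitions from two different contexts.
          writeToSharkTable(sharkSideSc.parallelize(rows))
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }

That way nothing computed under one context is ever passed directly into a job scheduled by the other.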
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>


On Wed, Jul 2, 2014 at 11:57 AM, Honey Joshi <honeyjo...@ideata-analytics.com> wrote:

> On Wed, July 2, 2014 1:11 am, Mayur Rustagi wrote:
> > Ideally you should be converting the RDD to a SchemaRDD?
> > Are you creating a UnionRDD to join across DStream RDDs?
> >
> > Mayur Rustagi
> > Ph: +1 (760) 203 3257
> > http://www.sigmoidanalytics.com
> > @mayur_rustagi <https://twitter.com/mayur_rustagi>
> >
> > On Tue, Jul 1, 2014 at 3:11 PM, Honey Joshi <honeyjo...@ideata-analytics.com> wrote:
> >>
> >> Hi,
> >> I am trying to run a project which takes data as a DStream and dumps the
> >> data into a Shark table after various operations. I am getting the
> >> following error:
> >>
> >> Exception in thread "main" org.apache.spark.SparkException: Job aborted:
> >> Task 0.0:0 failed 1 times (most recent failure: Exception failure:
> >> java.lang.ClassCastException: org.apache.spark.rdd.UnionPartition cannot
> >> be cast to org.apache.spark.rdd.HadoopPartition)
> >>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1028)
> >>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1026)
> >>   at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> >>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> >>   at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1026)
> >>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:619)
> >>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:619)
> >>   at scala.Option.foreach(Option.scala:236)
> >>   at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:619)
> >>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:207)
> >>   at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
> >>   at akka.actor.ActorCell.invoke(ActorCell.scala:456)
> >>   at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
> >>   at akka.dispatch.Mailbox.run(Mailbox.scala:219)
> >>   at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
> >>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> >>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> >>   at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> >>   at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> >>
> >> Can someone please explain the cause of this error? I am also using a
> >> SparkContext alongside the existing StreamingContext.
>
> I am using Spark 0.9.0-incubating, so it doesn't have anything to do with
> SchemaRDD. This error is probably coming up because I am trying to use one
> Spark context and one Shark context in the same job. Is there any way to
> incorporate two contexts in one job?
>
> Regards
>
> Honey Joshi
> Ideata-Analytics