Are the Hadoop configuration files on the classpath for your Mesos executors?
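If not, one workaround that has helped with "No FileSystem for scheme: hdfs" is to register the HDFS implementation explicitly on the Hadoop configuration Spark uses. A minimal sketch only, assuming Spark 0.9.x and a SparkContext named sc (the one behind your StreamingContext); fs.hdfs.impl is a standard Hadoop key, the rest is guesswork about your setup:

    // Force registration of the HDFS FileSystem implementation on the
    // Hadoop configuration Spark ships to its tasks. This sidesteps the
    // case where an assembly/uber jar has clobbered the
    // META-INF/services/org.apache.hadoop.fs.FileSystem service file.
    sc.hadoopConfiguration.set("fs.hdfs.impl",
      classOf[org.apache.hadoop.hdfs.DistributedFileSystem].getName)

Alternatively, check that the directory holding core-site.xml and hdfs-site.xml is on SPARK_CLASSPATH in the executor distribution Mesos unpacks, not just on the driver. That could also explain why checkpointing works for a few cycles and then fails: tasks on differently configured executors.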
On Thu, Jul 3, 2014 at 6:45 PM, Steven Cox <s...@renci.org> wrote:

> ...and a real subject line.
> ------------------------------
> From: Steven Cox [s...@renci.org]
> Sent: Thursday, July 03, 2014 9:21 PM
> To: user@spark.apache.org
> Subject:
>
> Folks, I have a program derived from the Kafka streaming wordcount
> example which works fine standalone.
>
> Running on Mesos is not working so well. For starters, I get the error
> below, "No FileSystem for scheme: hdfs".
>
> I've looked at lots of promising comments on this issue, so now I have:
>
> * Every jar under hadoop in my classpath
> * Hadoop HDFS and Client in my pom.xml
>
> I find it odd that the app writes checkpoint files to HDFS successfully
> for a couple of cycles, then throws this exception. This would suggest
> the problem is not with the syntax of the HDFS URL, for example.
>
> Any thoughts on what I'm missing?
>
> Thanks,
>
> Steve
>
> Mesos: 0.18.2
> Spark: 0.9.1
>
> 14/07/03 21:14:20 WARN TaskSetManager: Lost TID 296 (task 1514.0:0)
> 14/07/03 21:14:20 WARN TaskSetManager: Lost TID 297 (task 1514.0:1)
> 14/07/03 21:14:20 WARN TaskSetManager: Lost TID 298 (task 1514.0:0)
> 14/07/03 21:14:20 ERROR TaskSetManager: Task 1514.0:0 failed 10 times; aborting job
> 14/07/03 21:14:20 ERROR JobScheduler: Error running job streaming job 1404436460000 ms.0
> org.apache.spark.SparkException: Job aborted: Task 1514.0:0 failed 10 times (most recent failure: Exception failure: java.io.IOException: No FileSystem for scheme: hdfs)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1020)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1018)
>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>         at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1018)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
>         at scala.Option.foreach(Option.scala:236)
>         at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:604)
>         at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:190)
>         at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>         at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>         at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)