A quick Google search for that exception says it is thrown when a static initializer (the code that initializes an object's static fields) fails. It could be some issue with how dissection is defined, or with something else in the Main object that only gets initialized when the class is first loaded on the executors. Maybe try moving the function into a separate object that is unrelated to the Main class, which may have other static initialization stuff? It's hard to say much more without more context.
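
For example, something along these lines (just a rough sketch -- "Dissection" is a name I made up, and the two shell commands are the placeholders from your snippet):

import scala.sys.process._

// Standalone top-level object: referencing it on the executors should not
// trigger whatever static initialization lives in Main.
object Dissection {
  def dissection(s: String): Seq[String] = {
    try {
      Process("hadoop command to create ./localcopy.tmp").!  // copy the file from S3 to a local temp file
      Process("tshark … localcopy.tmp").lines_!.toSeq        // run tshark on the local copy and collect its output lines
    } catch {
      case e: Exception =>
        System.err.println("ERROR")
        Seq.empty[String]
    }
  }
}

and then call Dissection.dissection(...) in your transformation instead of the version defined inside Main. If the error goes away, the problem was in Main's static initialization rather than in this function.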

TD


On Tue, Jul 15, 2014 at 2:17 AM, Gianluca Privitera <
gianluca.privite...@studio.unibo.it> wrote:

>  Hi,
> I’ve got a problem with Spark Streaming and tshark.
> While running locally this code works fine, but when I run it on an EC2
> cluster I get the exception shown just below the code.
>
>  def dissection(s: String): Seq[String] = {
>     try {
>       Process("hadoop command to create ./localcopy.tmp").! // calls hadoop to copy a file from s3 locally
>       val pb = Process("tshark … localcopy.tmp") // calls tshark to transform the s3 file into a sequence of strings
>       val returnValue = pb.lines_!.toSeq
>       return returnValue
>     } catch {
>       case e: Exception =>
>         System.err.println("ERROR")
>         return new MutableList[String]()
>     }
>   }
>
>  (line 2051 points to the function “dissection”)
>
>  WARN scheduler.TaskSetManager: Loss was due to java.lang.ExceptionInInitializerError
> java.lang.ExceptionInInitializerError
> at Main$$anonfun$11.apply(Main.scala:2051)
> at Main$$anonfun$11.apply(Main.scala:2051)
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
> at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
> at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
> at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
> at org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:58)
> at org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:96)
> at org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:95)
> at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:582)
> at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:582)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:158)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
> at org.apache.spark.scheduler.Task.run(Task.scala:51)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:183)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:744)
>
>  Has anyone got an idea why that may happen? I’m pretty sure that the
> hadoop call works perfectly.
>
>  Thanks
> Gianluca
>
