Brian Schrameck created SPARK-16882:
---------------------------------------

             Summary: Failures in JobGenerator Thread are Swallowed, Job Does Not Fail
                 Key: SPARK-16882
                 URL: https://issues.apache.org/jira/browse/SPARK-16882
             Project: Spark
          Issue Type: Bug
          Components: Scheduler, Streaming
    Affects Versions: 1.5.0
         Environment: CDH 5.6.1, CentOS 6.7
            Reporter: Brian Schrameck


When using the fileStream functionality to read a directory containing a large number of files over a long period of time, the JVM's garbage-collection overhead limit can be reached. In our case, the JobGenerator thread threw an OutOfMemoryError, but the exception was completely swallowed and did not cause the job to fail. There were no errors in the ApplicationMaster, and the job silently sat there without processing any further batches.

We would expect any fatal exception, not just this particular OutOfMemoryError, to be handled appropriately: the job should be killed and exit with the correct failure code.

We are running in YARN cluster mode on a CDH 5.6.1 cluster.
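To illustrate the behavior we would expect (this is only a sketch, not Spark's actual code or a proposed patch): on the JVM, an exception thrown from a background thread's run loop kills only that thread unless something intercepts it. A default uncaught-exception handler can surface such failures and halt the process on fatal `Error`s such as `OutOfMemoryError`, so the job fails visibly instead of hanging. The class name `FatalHandlerSketch` and the thread name used below are hypothetical.

```java
// Illustrative sketch only: turn fatal errors in background threads into a
// visible process exit instead of a silently dead thread.
public class FatalHandlerSketch {

    /** Install a JVM-wide handler for exceptions no thread caught itself. */
    public static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, throwable) -> {
            System.err.println("Uncaught exception in thread \""
                    + thread.getName() + "\": " + throwable);
            if (throwable instanceof Error) {
                // Fatal JVM errors (e.g. OutOfMemoryError) should fail the
                // whole job with a non-zero exit code, not be swallowed.
                Runtime.getRuntime().halt(1);
            }
        });
    }

    public static void main(String[] args) throws InterruptedException {
        install();
        // Simulate a background scheduler thread dying with an exception.
        // A RuntimeException is not an Error, so here the handler only logs
        // it and the process keeps running.
        Thread t = new Thread(() -> {
            throw new RuntimeException("simulated failure in background thread");
        }, "JobGenerator");
        t.start();
        t.join();
        System.out.println("handler logged the failure; process still alive");
    }
}
```

Without such a handler (or equivalent monitoring of the thread), the outcome is exactly what we observed: the thread dies, nothing is reported, and the streaming job stops generating batches while appearing healthy.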

{noformat}
Exception in thread "JobGenerator" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.lang.AbstractStringBuilder.<init>(AbstractStringBuilder.java:68)
        at java.lang.StringBuilder.<init>(StringBuilder.java:89)
        at org.apache.hadoop.fs.Path.<init>(Path.java:109)
        at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:430)
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1494)
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1534)
        at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:569)
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1494)
        at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1534)
        at org.apache.spark.streaming.dstream.FileInputDStream.findNewFiles(FileInputDStream.scala:195)
        at org.apache.spark.streaming.dstream.FileInputDStream.compute(FileInputDStream.scala:146)
        at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:350)
        at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:350)
        at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
        at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:349)
        at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:349)
        at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:399)
        at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:344)
        at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:342)
        at scala.Option.orElse(Option.scala:257)
        at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:339)
        at org.apache.spark.streaming.dstream.ForEachDStream.generateJob(ForEachDStream.scala:38)
        at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:120)
        at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:120)
        at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
        at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
        at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
        at org.apache.spark.streaming.DStreamGraph.generateJobs(DStreamGraph.scala:120)
        at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$2.apply(JobGenerator.scala:247)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
