Hi Qiuzhuang,

This is a Linux-related issue: your log shows ExternalAppendOnlyMap spilling
to disk hundreds of times, and each spill opens temporary files, so the
default per-process open-files limit (typically 1024) is easily exhausted.
Please go through [1] and raise the open-files limit; hopefully that solves
your problem.

[1] https://rtcamp.com/tutorials/linux/increase-open-files-limit/
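In case it helps to verify that the new limit actually reaches the Spark
executors, here is a minimal Scala sketch (my own illustration, not part of
Spark) that prints the JVM's current and maximum file-descriptor counts via
the com.sun.management API; on a Linux HotSpot JVM the OS MXBean is a
UnixOperatingSystemMXBean:

    import java.lang.management.ManagementFactory
    import com.sun.management.UnixOperatingSystemMXBean

    // Illustrative sketch: print fd usage and limit of the current JVM.
    object FdCheck {
      def main(args: Array[String]): Unit = {
        ManagementFactory.getOperatingSystemMXBean match {
          // On Unix-like JVMs the OS MXBean exposes fd counters.
          case os: UnixOperatingSystemMXBean =>
            println(s"open file descriptors: ${os.getOpenFileDescriptorCount}")
            println(s"max file descriptors:  ${os.getMaxFileDescriptorCount}")
          case _ =>
            println("File-descriptor counters not exposed by this JVM/OS.")
        }
      }
    }

If the maximum still reads 1024 after editing /etc/security/limits.conf, the
limit is probably not being applied to the shell or daemon that launches the
workers, so you may also need to set it where the Spark processes are started.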

On Thu, Nov 20, 2014 at 9:45 AM, Qiuzhuang Lian <qiuzhuang.l...@gmail.com>
wrote:

> Hi All,
>
> While doing some ETL, I ran into a 'Too many open files' error, as shown
> in the following logs:
>
> Thanks,
> Qiuzhuang
>
> 14/11/20 20:12:02 INFO collection.ExternalAppendOnlyMap: Thread 63 spilling
> in-memory map of 100.8 KB to disk (953 times so far)
> 14/11/20 20:12:02 ERROR storage.DiskBlockObjectWriter: Uncaught exception
> while reverting partial writes to file
> /tmp/spark-local-20141120200455-4137/2f/temp_local_f83cbf2f-60a4-4fbd-b5d2-32a0c569311b
> java.io.FileNotFoundException:
> /tmp/spark-local-20141120200455-4137/2f/temp_local_f83cbf2f-60a4-4fbd-b5d2-32a0c569311b
> (Too many open files)
>         at java.io.FileOutputStream.open(Native Method)
>         at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
>         at org.apache.spark.storage.DiskBlockObjectWriter.revertPartialWritesAndClose(BlockObjectWriter.scala:178)
>         at org.apache.spark.util.collection.ExternalAppendOnlyMap.spill(ExternalAppendOnlyMap.scala:203)
>         at org.apache.spark.util.collection.ExternalAppendOnlyMap.spill(ExternalAppendOnlyMap.scala:63)
>         at org.apache.spark.util.collection.Spillable$class.maybeSpill(Spillable.scala:77)
>         at org.apache.spark.util.collection.ExternalAppendOnlyMap.maybeSpill(ExternalAppendOnlyMap.scala:63)
>         at org.apache.spark.util.collection.ExternalAppendOnlyMap.insertAll(ExternalAppendOnlyMap.scala:131)
>         at org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$5.apply(CoGroupedRDD.scala:160)
>         at org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$5.apply(CoGroupedRDD.scala:159)
>         at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772)
>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>         at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771)
>         at org.apache.spark.rdd.CoGroupedRDD.compute(CoGroupedRDD.scala:159)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
>         at org.apache.spark.rdd.MappedValuesRDD.compute(MappedValuesRDD.scala:31)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
>         at org.apache.spark.rdd.FlatMappedValuesRDD.compute(FlatMappedValuesRDD.scala:31)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
>         at org.apache.spark.rdd.FilteredRDD.compute(FilteredRDD.scala:34)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
>         at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
>         at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:61)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:228)
>         at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
>         at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>         at org.apache.spark.scheduler.Task.run(Task.scala:56)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
> 14/11/20 20:12:02 ERROR executor.Executor: Exception in task 0.0 in stage
> 36.0 (TID 20)
> java.io.FileNotFoundException:
> /tmp/spark-local-20141120200455-4137/2f/temp_local_f83cbf2f-60a4-4fbd-b5d2-32a0c569311b
> (Too many open files)
>         at java.io.FileOutputStream.open(Native Method)
>         at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
>         at org.apache.spark.storage.DiskBlockObjectWriter.open(BlockObjectWriter.scala:123)
>         at org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:192)
>         at org.apache.spark.util.collection.ExternalAppendOnlyMap.spill(ExternalAppendOnlyMap.scala:180)
>         at org.apache.spark.util.collection.ExternalAppendOnlyMap.spill(ExternalAppendOnlyMap.scala:63)
>         at org.apache.spark.util.collection.Spillable$class.maybeSpill(Spillable.scala:77)
>         at org.apache.spark.util.collection.ExternalAppendOnlyMap.maybeSpill(ExternalAppendOnlyMap.scala:63)
>         at org.apache.spark.util.collection.ExternalAppendOnlyMap.insertAll(ExternalAppendOnlyMap.scala:131)
>



-- 
Thanks & Best Regards,

*Dinesh J. Weerakkody*
