Re: Too many open files exception

Alexander Bezzubov Tue, 22 Dec 2015 20:02:17 -0800

Hi,

Welcome to the Zeppelin community!


It looks like you are doing everything right but are running into a
platform-specific issue: Spark is hitting the limit of open files on
your OS.

This should not happen, so could you please check the current open-file
limit in your environment/OS and (just in case) cross-check the
Spark-specific mailing list, in case this is a known issue?
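
For the limit check, running `ulimit -n` in the shell that starts Zeppelin
prints the per-process limit. As a rough sketch, the same numbers can also be
read from inside a Zeppelin Scala paragraph. The helper name `fdUsage` is
made up for this example, and it assumes a HotSpot/OpenJDK-style JVM on a
Unix-like OS. It reports only the JVM that runs the paragraph, which in local
mode is also the one running the Spark tasks.

```scala
import java.lang.management.ManagementFactory
import com.sun.management.UnixOperatingSystemMXBean

// Hypothetical helper: returns (currently open, maximum allowed) file
// descriptors for this JVM, or None if the JVM does not expose them.
def fdUsage(): Option[(Long, Long)] =
  ManagementFactory.getOperatingSystemMXBean match {
    case os: UnixOperatingSystemMXBean =>
      Some((os.getOpenFileDescriptorCount, os.getMaxFileDescriptorCount))
    case _ => None
  }

fdUsage() match {
  case Some((open, max)) => println(s"open file descriptors: $open of $max")
  case None => println("file descriptor counts are not available on this JVM")
}
```

If the maximum turns out to be small, the limit itself is the likely culprit
rather than anything in the notebook.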

--
Alex

On Wed, Dec 23, 2015, 10:07 Amirhossein Aleyasin <amir.8...@gmail.com>
wrote:

> Hello,
> I am new to Zeppelin. I just installed it and tried to run the tutorial
> example.
> The "load data into Table" part works perfectly, but when I submit
> the sample queries, it throws the following exception:
>
>
> java.io.FileNotFoundException: /tmp/blockmgr-5d2c5999-5593-4f83-9d6d-3c290523ce29/3f/temp_shuffle_102ac16f-b5c6-4cc4-9c8e-b6bc66f17eb5 (Too many open files)
>     at java.io.FileOutputStream.open(Native Method)
>     at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
>     at org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:88)
>     at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.insertAll(BypassMergeSortShuffleWriter.java:110)
>     at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>     at org.apache.spark.scheduler.Task.run(Task.scala:88)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
>
> This is the load table code:
>
> import org.apache.commons.io.IOUtils
> import java.net.URL
> import java.nio.charset.Charset
>
> // Zeppelin creates and injects sc (SparkContext) and sqlContext
> // (HiveContext or SqlContext),
> // so you don't need to create them manually
>
> // load bank data
> val bankText = sc.parallelize(
>     IOUtils.toString(
>         new URL("https://s3.amazonaws.com/apache-zeppelin/tutorial/bank/bank.csv"),
>         Charset.forName("utf8")).split("\n"))
>
> case class Bank(age: Integer, job: String, marital: String,
>     education: String, balance: Integer)
>
> val bank = bankText.map(s => s.split(";")).filter(s => s(0) != "\"age\"").map(
>     s => Bank(s(0).toInt,
>             s(1).replaceAll("\"", ""),
>             s(2).replaceAll("\"", ""),
>             s(3).replaceAll("\"", ""),
>             s(5).replaceAll("\"", "").toInt
>         )
> ).toDF()
> bank.registerTempTable("bank")
>
>
> and this is the query:
>
> %sql
> select age, count(1) value
> from bank
> where age < 30
> group by age
> order by age
>
>
> Any help appreciated.
>
> Thanks
>
>
>
