Hi, welcome to the Zeppelin community!
It looks like you are doing everything right but are hitting a platform-specific issue: Spark is reaching the limit of open files on your OS.

This should not happen, so could you please check the current open-file limit in your environment/OS and (just in case) cross-check the Spark-specific mailing list, in case this is a known issue?

--
Alex

On Wed, Dec 23, 2015, 10:07 Amirhossein Aleyasin <amir.8...@gmail.com> wrote:

> Hello,
> I am new to zeppelin, I just installed it and tried to run the tutorial
> example.
> The "load data into Table" part works perfect, but when I wanted to submit
> the sample queries, it throws the following exception:
>
> java.io.FileNotFoundException:
> /tmp/blockmgr-5d2c5999-5593-4f83-9d6d-3c290523ce29/3f/temp_shuffle_102ac16f-b5c6-4cc4-9c8e-b6bc66f17eb5
> (Too many open files)
>   at java.io.FileOutputStream.open(Native Method)
>   at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
>   at org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:88)
>   at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.insertAll(BypassMergeSortShuffleWriter.java:110)
>   at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>   at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>   at org.apache.spark.scheduler.Task.run(Task.scala:88)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
>
> This is the load table code:
>
> import org.apache.commons.io.IOUtils
> import java.net.URL
> import java.nio.charset.Charset
>
> // Zeppelin creates and injects sc (SparkContext) and sqlContext (HiveContext or SqlContext)
> // So you don't need create them manually
>
> // load bank data
> val bankText = sc.parallelize(
>     IOUtils.toString(
>         new URL("https://s3.amazonaws.com/apache-zeppelin/tutorial/bank/bank.csv"),
>         Charset.forName("utf8")).split("\n"))
>
> case class Bank(age: Integer, job: String, marital: String, education: String, balance: Integer)
>
> val bank = bankText.map(s => s.split(";")).filter(s => s(0) != "\"age\"").map(
>     s => Bank(s(0).toInt,
>             s(1).replaceAll("\"", ""),
>             s(2).replaceAll("\"", ""),
>             s(3).replaceAll("\"", ""),
>             s(5).replaceAll("\"", "").toInt
>         )
> ).toDF()
> bank.registerTempTable("bank")
>
> and this is the query:
>
> %sql
> select age, count(1) value
> from bank
> where age < 30
> group by age
> order by age
>
> Any help appreciated.
>
> Thanks
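
One possible way to check the open-file limit from inside Zeppelin itself is the sketch below, meant to be pasted into a Scala paragraph (or any Scala REPL). It assumes a Unix-like OS and a HotSpot/OpenJDK JVM, where the operating-system MXBean implements com.sun.management.UnixOperatingSystemMXBean. The helper name fdUsage is made up for this example. The numbers describe only the JVM that runs the paragraph (the Spark interpreter process), which is the relevant process when Spark runs in local mode; running `ulimit -n` in the shell that starts Zeppelin should report the corresponding per-process limit.

```scala
import java.lang.management.ManagementFactory
import com.sun.management.UnixOperatingSystemMXBean

// fdUsage is a hypothetical helper name used only for this sketch.
// It returns (open, max) file-descriptor counts for the current JVM,
// or None when the platform does not expose them (for example on Windows).
def fdUsage(): Option[(Long, Long)] =
  ManagementFactory.getOperatingSystemMXBean match {
    case os: UnixOperatingSystemMXBean =>
      Some((os.getOpenFileDescriptorCount(), os.getMaxFileDescriptorCount()))
    case _ =>
      None
  }

fdUsage() match {
  case Some((open, max)) =>
    println(s"open file descriptors: $open of $max allowed")
  case None =>
    println("file-descriptor counts are not available on this JVM/OS")
}
```

If the reported maximum is small (a few hundred, say), a shuffle that writes many temporary files at once could plausibly exhaust it, which would be consistent with the "Too many open files" message in the stack trace.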