Re: What does Zeppelin try to do during web application startup?

2017-03-26 Thread Serega Sheypak
Cool, is there any possibility to pre-download and bundle it in the app somehow? 2017-03-26 16:46 GMT+02:00 Иван Шаповалов: > As part of application startup, Helium downloads and installs the npm & node > versions it needs. This greatly increases startup time and may be one of the reasons …
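
For reference: later Zeppelin releases expose configuration properties that let Helium pull node and npm from an internal mirror instead of the public internet, which is one way to "pre-download" them. A sketch, assuming a Zeppelin version that has these properties (the mirror URL is a placeholder):

    <!-- conf/zeppelin-site.xml -->
    <property>
      <name>zeppelin.helium.node.installer.url</name>
      <value>http://mirror.example.com/node/dist/</value>
    </property>
    <property>
      <name>zeppelin.helium.npm.installer.url</name>
      <value>http://mirror.example.com/npm/</value>
    </property>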

Re: Setting Zeppelin to work with multiple Hadoop clusters when running Spark.

2017-03-26 Thread Serega Sheypak
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html You don't have to rely on a single NN. You can specify a kind of "NN HA alias" and the underlying HDFS client will connect to whichever NN is active right now. Thanks for pointing out HADOOP_CONF_DIR, seems like …
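
The "NN HA alias" is a logical nameservice declared in hdfs-site.xml; clients address hdfs://<nameservice>/ and a failover proxy provider resolves the active NameNode. A minimal sketch from the linked HDFS HA docs (hostnames are placeholders):

    <!-- hdfs-site.xml -->
    <property><name>dfs.nameservices</name><value>mycluster</value></property>
    <property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
    <property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>nn1.example.com:8020</value></property>
    <property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>nn2.example.com:8020</value></property>
    <property><name>dfs.client.failover.proxy.provider.mycluster</name>
      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>

Paths like hdfs://mycluster/data/file then resolve to whichever NameNode is currently active.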

Re: What does Zeppelin try to do during web application startup?

2017-03-26 Thread Иван Шаповалов
As part of application startup, Helium downloads and installs the npm & node versions it needs. This greatly increases startup time and may be one of the reasons. 2017-03-26 14:51 GMT+03:00 Serega Sheypak: > Hi, I'm trying to run Zeppelin 0.8.0-SNAPSHOT in Docker. Startup takes > forever …

Re: Setting Zeppelin to work with multiple Hadoop clusters when running Spark.

2017-03-26 Thread Jianfeng (Jeff) Zhang
What do you mean by non-reliable? If you want to read/write two Hadoop clusters in one program, I am afraid this is the only way. It is impossible to specify multiple HADOOP_CONF_DIRs on one JVM classpath; only one default configuration will be used. Best Regard, Jeff Zhang From: Serega Sheypa…
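
In practice that means addressing each cluster with a fully-qualified URI inside one Spark job, as in this sketch (hostnames and paths are hypothetical):

    // Read from cluster A and write to cluster B within one SparkSession
    val df = spark.read.csv("hdfs://nn-a.example.com:8020/data/input.csv")
    df.write.parquet("hdfs://nn-b.example.com:8020/data/output")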

Re: Zeppelin out of memory issue - (GC overhead limit exceeded)

2017-03-26 Thread Jianfeng (Jeff) Zhang
I verified it on the master branch; it works for me. Set it in the interpreter setting page as follows (screenshot attached). Best Regard, Jeff Zhang From: RUSHIKESH RAUT Reply-To: "users@zeppelin.apache.org" …

Re: Zeppelin out of memory issue - (GC overhead limit exceeded)

2017-03-26 Thread RUSHIKESH RAUT
Thanks Jianfeng, but I am still not able to solve the issue. I have set it to 4g but still no luck. Can you please explain how I can set the SPARK_DRIVER_MEMORY property? Also, as I have read, the "GC overhead limit exceeded" error occurs when the heap memory is insufficient. So how can I increase …

What does Zeppelin try to do during web application startup?

2017-03-26 Thread Serega Sheypak
Hi, I'm trying to run Zeppelin 0.8.0-SNAPSHOT in Docker. Startup takes forever. It starts in seconds when launched on the host, but not in a Docker container. I suspect the Docker container has a poorly configured network and some part of Zeppelin tries to reach a remote resource. SLF4J: See http://www.slf4j.org/codes …

Re: Setting Zeppelin to work with multiple Hadoop clusters when running Spark.

2017-03-26 Thread Serega Sheypak
I know it, thanks, but it's not a reliable solution. 2017-03-26 5:23 GMT+02:00 Jianfeng (Jeff) Zhang: > > You can try to specify the namenode address for the hdfs file, e.g. > > spark.read.csv("hdfs://localhost:9009/file") > > Best Regard, > Jeff Zhang > > > From: Serega Sheypak > Reply-To: "users@zep…

Re: Zeppelin out of memory issue - (GC overhead limit exceeded)

2017-03-26 Thread Jianfeng (Jeff) Zhang
This is a bug in Zeppelin: spark.driver.memory won't take effect, because as of now it isn't passed to Spark through the --conf parameter. See https://issues.apache.org/jira/browse/ZEPPELIN-1263. The workaround is to specify SPARK_DRIVER_MEMORY in the interpreter setting page. Best Regard, Jeff Zhang From: …
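
The workaround, then, is to add SPARK_DRIVER_MEMORY as a property in the Spark interpreter settings (Interpreter menu, spark, edit):

    SPARK_DRIVER_MEMORY    4g

or, assuming a standard install layout, to export it in conf/zeppelin-env.sh before starting Zeppelin:

    export SPARK_DRIVER_MEMORY=4g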

Re: Zeppelin out of memory issue - (GC overhead limit exceeded)

2017-03-26 Thread RUSHIKESH RAUT
I tried setting it as spark.driver.memory 4g, but it still gives the same error, so I tried it with -X flags. Now I have removed them. But as per my understanding this is the Spark driver memory; I want to increase the heap size used by the interpreter, because when I run *ps aux | grep zeppelin* on my machine …
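
For the interpreter process heap itself (as opposed to the Spark driver), zeppelin-env.sh has dedicated variables; the names below match the zeppelin-env.sh.template of recent releases, so verify them against your version:

    # conf/zeppelin-env.sh
    export ZEPPELIN_MEM="-Xms1024m -Xmx4096m"       # Zeppelin server JVM
    export ZEPPELIN_INTP_MEM="-Xms1024m -Xmx4096m"  # interpreter process JVM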

Re: Zeppelin out of memory issue - (GC overhead limit exceeded)

2017-03-26 Thread Eric Charles
You don't have to set spark.driver.memory with -X flags, but simply with a memory size. Look at http://spark.apache.org/docs/latest/configuration.html: spark.driver.memory (default 1g) is the amount of memory to use for the driver process, i.e. where SparkContext is initialized (e.g. 1g, 2g). Note: in client mode, …
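
In other words, the property takes a plain size string, while raw JVM flags belong in a separate property:

    spark.driver.memory            4g                     # a size string, not JVM flags
    spark.driver.extraJavaOptions  -XX:MaxPermSize=2048m  # other JVM flags go here; Spark rejects -Xmx in this property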

Re: Zeppelin out of memory issue - (GC overhead limit exceeded)

2017-03-26 Thread RUSHIKESH RAUT
What value should I set there? Currently I have set it as spark.driver.memory -Xms4096m -Xmx4096m -XX:MaxPermSize=2048m, but still the same error. On Mar 26, 2017 1:19 PM, "Eric Charles" wrote: > You also have to check the memory you give to the Spark driver > (the spark.driver.memory property) > > On …

Re: Zeppelin out of memory issue - (GC overhead limit exceeded)

2017-03-26 Thread Eric Charles
You also have to check the memory you give to the Spark driver (the spark.driver.memory property). On 26/03/17 07:40, RUSHIKESH RAUT wrote: Yes, I know it's inevitable if the data is large. I want to know how I can increase the interpreter memory to handle large data. Thanks, Rushikesh Raut On Mar 26, …

Re: Zeppelin hangs in air-gapped environments

2017-03-26 Thread Eric Charles
Running off-line (or in closed data centers) now means a super-slow start, since HeliumBundleFactory connects to the npms.org web site on startup. If the dependent website is not reachable, we should simply fast-skip that step. What about adding a check for this? On 24/03/17 15:16, Raffaele S wrote: Hello …
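
The kind of fast-skip check suggested here could look like the following sketch (Scala on the JVM; the method and the registry host are illustrative, not Zeppelin's actual API):

    import java.net.{InetSocketAddress, Socket}

    // Return quickly instead of hanging when the registry host is unreachable.
    def isReachable(host: String, port: Int, timeoutMs: Int = 3000): Boolean = {
      val socket = new Socket()
      try { socket.connect(new InetSocketAddress(host, port), timeoutMs); true }
      catch { case _: java.io.IOException => false }
      finally { socket.close() }
    }

    // e.g. skip the Helium bundle refresh when the registry is unreachable:
    // if (!isReachable("registry.npmjs.org", 443)) { /* fast-skip */ }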