If *zeppelin.interpreter.connect.timeout* is reached but the yarn app is
still in ACCEPTED state, then this is most likely a bug. The yarn app should be
killed if it cannot be started within the timeout threshold.
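
As a rough illustration of that expected behaviour (a sketch only, not
Zeppelin's actual launcher code): poll the application state until it is
RUNNING, and kill the application if the timeout passes first. The names
wait_for_running, get_state and kill_app are hypothetical and are passed in
as plain callables.

```python
import time
from typing import Callable

# YARN application states that can never transition to RUNNING.
TERMINAL_STATES = ("FINISHED", "FAILED", "KILLED")


def wait_for_running(get_state: Callable[[], str],
                     kill_app: Callable[[], None],
                     timeout_s: float,
                     poll_interval_s: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep,
                     clock: Callable[[], float] = time.monotonic) -> bool:
    """Return True once the app is RUNNING.

    If the app is still not RUNNING when the timeout expires (for example,
    it is stuck in ACCEPTED), call kill_app() and return False, so that no
    orphaned application is left waiting in the queue.
    """
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state == "RUNNING":
            return True
        if state in TERMINAL_STATES:
            # Already dead; nothing left to kill.
            return False
        if clock() >= deadline:
            # Timed out while still waiting: clean up instead of leaking the app.
            kill_app()
            return False
        sleep(poll_interval_s)
```

In a real setup, get_state and kill_app could wrap something like
`yarn application -status <appId>` and `yarn application -kill <appId>`;
that wiring is an assumption here and is left out of the sketch.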

On Tue, Nov 20, 2018 at 4:47 PM, Sarthak Sharma <sarthak...@media.net> wrote:

> Hey,
>
> As you mentioned, I'm already using the *spark.yarn.queue* parameter, so I
> know which yarn queue the app is being scheduled in, and this queue has
> resources available for applications, since other apps are also getting
> scheduled there.
> However, even assuming the queue does NOT have the resources to schedule it
> within the given time frame, so that an exception is thrown once
> *zeppelin.interpreter.connect.timeout* is reached, the application should
> in any case get scheduled eventually, which is not the case here. The
> interpreter driver process remains stuck in ACCEPTED state. Has the way this
> is implemented changed in this version? We never experienced this on the
> previous one (zeppelin-0.7.3), where drivers would eventually get scheduled
> in their respective queues.
>
> On Tue, Nov 20, 2018, 7:29 AM Xun Liu <neliu...@163.com> wrote:
>
>> Hi Sarthak Sharma,
>>
>> The log shows that the task submitted by spark-submit has been waiting
>> for execution in the YARN queue. Does the YARN queue have no resources
>> available?
>> You can specify a queue that has resources in the spark interpreter via the
>> spark.yarn.queue parameter.
>>
>>
>> On Nov 19, 2018, at 7:41 PM, Sarthak Sharma <sarthak...@media.net> wrote:
>>
>> Hi,
>>
>> We already have a zeppelin-0.7.3 setup which runs fine and is currently
>> in use, but we are looking into the yarn cluster mode support for the spark
>> interpreter in zeppelin-0.8. I've built it from source from *branch-0.8
>> (as of Nov-15)* and am intermittently facing the following issue in
>> some of the spark interpreters while trying to use spark-sql on them.
>>
>> *18/11/19 10:04:07 INFO yarn.Client: Submitting application
>> application_1542587655772_35129 to ResourceManager*
>> *18/11/19 10:04:07 INFO impl.YarnClientImpl: Submitted application
>> application_1542587655772_35129*
>> *18/11/19 10:04:08 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:08 INFO yarn.Client:*
>> *  client token: N/A*
>> *  diagnostics: N/A*
>> *  ApplicationMaster host: N/A*
>> *  ApplicationMaster RPC port: -1*
>> *  queue: root.zep*
>> *  start time: 1542621847537*
>> *  final status: UNDEFINED*
>> *  tracking URL: http://resource-manager-addr/proxy/application_1542587655772_35129/*
>> *  user: sarthak.sh*
>> *18/11/19 10:04:09 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:10 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:11 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:12 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:13 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:14 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:15 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:16 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:17 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:18 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:19 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:20 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:21 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:22 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:23 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:24 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:25 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:26 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:27 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:28 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:29 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:30 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:31 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:32 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:33 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:34 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:35 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:36 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:37 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:38 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:39 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:40 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:41 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:42 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:43 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:44 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:45 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:46 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:47 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:48 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:49 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:50 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:51 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:52 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:53 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:54 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:55 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:56 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:57 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:58 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:04:59 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:00 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:01 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:02 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:03 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:04 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:05 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:06 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:07 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:08 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:09 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:10 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:11 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:12 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:13 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:14 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:15 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:16 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:17 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:18 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:19 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:20 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:21 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:22 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:23 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:24 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:25 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:26 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:27 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:28 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:29 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:30 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:31 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:32 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:33 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:34 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:35 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:36 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:37 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:38 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:39 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:40 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:41 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:42 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:43 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:44 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:45 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:46 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:47 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:48 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:49 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:50 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:51 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:52 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:53 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:54 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:55 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:56 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:57 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:58 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>> *18/11/19 10:05:59 INFO yarn.Client: Application report for
>> application_1542587655772_35129 (state: ACCEPTED)*
>>
>> * at org.apache.zeppelin.interpreter.remote.RemoteInterpreterManagedProcess.start(RemoteInterpreterManagedProcess.java:205)*
>> * at org.apache.zeppelin.interpreter.ManagedInterpreterGroup.getOrCreateInterpreterProcess(ManagedInterpreterGroup.java:64)*
>> * at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getOrCreateInterpreterProcess(RemoteInterpreter.java:111)*
>> * at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.internal_create(RemoteInterpreter.java:164)*
>> * at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:132)*
>> * at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:299)*
>> * at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:407)*
>> * at org.apache.zeppelin.scheduler.Job.run(Job.java:188)*
>> * at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:315)*
>> * at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)*
>> * at java.util.concurrent.FutureTask.run(FutureTask.java:266)*
>> * at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)*
>> * at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)*
>> * at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)*
>> * at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)*
>> * at java.lang.Thread.run(Thread.java:748)*
>>
>> Any further submission to this interpreter gives null pointer exceptions
>> due to the absence of an interpreter process.
>> It looks like the interpreter driver process gets stuck in ACCEPTED state
>> when it is submitted to yarn, because of which we're not able to connect
>> to the remote interpreter process. This happens even when there are
>> resources available on the yarn cluster.
>> I've also tried increasing *zeppelin.interpreter.connect.timeout*, but
>> that didn't help, since the application is stuck in ACCEPTED state
>> indefinitely and there are no logs available either.
>> It would be great if you could point me to something that can help. Also,
>> please let me know if any configuration files are required for debugging
>> this.
>>
>>
>> Thanks and Regards
>>
>>
>> *Sarthak Sharma*
>> DevOps Engineer, Media.Net <http://media.net/>
>> +918002228376 | sarthak...@media.net
>>
>>
>>

-- 
Best Regards

Jeff Zhang
