From the "connection refused" message I wonder whether this is an SSL error.  I
note that none of the SSL settings (truststore, keystore, etc.) are configured.
I would think the YARN cluster requires some form of authentication.

On 4/7/19 9:27 AM, Jeff Zhang wrote:
It looks like the interpreter process cannot connect to the Zeppelin server 
process. I guess it is due to a network issue. Can you check whether the 
nodes in the yarn cluster can connect to the Zeppelin server host?
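
A quick way to run that check, as a minimal sketch: the Python snippet below attempts a plain TCP connection to a given host and port, and could be run from a YARN node against the address the interpreter tries to reach. The address in the usage comment is the intpEventServerAddress from the log further down (172.17.0.1:45128); the helper name can_connect and the 3-second default timeout are assumptions for illustration. Note that 172.17.0.1 is commonly the default Docker bridge address, which other hosts usually cannot route to, so that may be worth ruling out.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it only tests raw TCP reachability
    and says nothing about Thrift, SSL, or authentication.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


# Example usage (address taken from the log; replace with your own):
#   print(can_connect("172.17.0.1", 45128))
```

If this returns False from a cluster node while the Zeppelin server is running, the problem is likely routing, firewalling, or the address Zeppelin advertises, rather than Spark itself.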

Y. Ethan Guo <guoyi...@uber.com> wrote on Sun, Apr 7, 2019 at 3:31 PM:
Hi Jeff,

Given this PR is merged, I'm trying to see if I can run yarn cluster mode from 
master build.  I built Zeppelin master from this commit:

commit 3655c12b875884410224eca5d6155287d51916ac
Author: Jongyoul Lee <jongy...@gmail.com>
Date:   Mon Apr 1 15:37:57 2019 +0900
    [MINOR] Refactor CronJob class (#3335)

While I can successfully run the Spark interpreter in yarn client mode, I'm 
having trouble getting yarn cluster mode to work.  Specifically, although the 
interpreter job was accepted by yarn, it failed after 1-2 minutes with the 
exception below.  Do you have any idea why this is happening?

DEBUG [2019-04-07 06:57:00,314] ({main} Logging.scala[logDebug]:58) - Created SSL options for fs: SSLOptions{enabled=false, keyStore=None, keyStorePassword=None, trustStore=None, trustStorePassword=None, protocol=None, enabledAlgorithms=Set()}
 INFO [2019-04-07 06:57:00,323] ({main} Logging.scala[logInfo]:54) - Starting the user application in a separate Thread
 INFO [2019-04-07 06:57:00,350] ({main} Logging.scala[logInfo]:54) - Waiting for spark context initialization...
 INFO [2019-04-07 06:57:00,403] ({Driver} RemoteInterpreterServer.java[<init>]:148) - Starting remote interpreter server on port 0, intpEventServerAddress: 172.17.0.1:45128
ERROR [2019-04-07 06:57:00,408] ({Driver} Logging.scala[logError]:91) - User class threw exception: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:226)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.<init>(RemoteInterpreterServer.java:154)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.<init>(RemoteInterpreterServer.java:139)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.main(RemoteInterpreterServer.java:285)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:635)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:221)
    ... 8 more

Thanks,
- Ethan

On Wed, Feb 27, 2019 at 4:24 PM Jeff Zhang <zjf...@gmail.com> wrote:
Here's the PR
https://github.com/apache/zeppelin/pull/3308

Y. Ethan Guo <guoyi...@uber.com> wrote on Thu, Feb 28, 2019 at 2:50 AM:
Hi All,

I'm trying to use the new feature of yarn cluster mode to run Spark 2.4.0 jobs 
on Zeppelin 0.8.1. I've set the SPARK_HOME, SPARK_SUBMIT_OPTIONS, and 
HADOOP_CONF_DIR env variables in zeppelin-env.sh so that the Spark interpreter 
can be started in the cluster. I used `--jars` in SPARK_SUBMIT_OPTIONS to add 
local jars. However, when I tried to import a class from the jars in a Spark 
paragraph, the interpreter complained that it cannot find the package and class 
("<console>:23: error: object ... is not a member of package ..."). Looks like 
the jars are not properly imported.
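
Before attributing this to cluster mode, it may help to rule out a packaging problem by confirming that the jar really contains the class being imported. The snippet below is only a sketch of that check: the helper name jar_contains_class, the jar path, and the class name in the usage comment are hypothetical, and it assumes a top-level class (nested classes and Scala objects use `$` in their entry names).

```python
import zipfile


def jar_contains_class(jar_path: str, class_name: str) -> bool:
    """Return True if the jar has a .class entry for the fully qualified name.

    Hypothetical helper: a jar is a zip archive, so we look for the entry
    'com/example/Foo.class' when asked about 'com.example.Foo'.
    """
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


# Example usage (placeholder path and class name):
#   print(jar_contains_class("/path/to/my-lib.jar", "com.example.MyClass"))
```

If the class is present in the jar, the remaining suspect is how the jar is shipped to the driver in yarn cluster mode.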

I followed the instructions at 
https://zeppelin.apache.org/docs/0.8.1/interpreter/spark.html#2-loading-spark-properties
to add the jars, but they do not seem to work in cluster mode.  This issue 
seems to be related to this bug: 
https://jira.apache.org/jira/browse/ZEPPELIN-3986.  Is there any update on 
fixing it? What is the right way to add local jars in yarn cluster mode? Any 
help or update is much appreciated.


Here's the SPARK_SUBMIT_OPTIONS I used (packages and jars paths omitted):

export SPARK_SUBMIT_OPTIONS="--driver-memory 12G --packages ... --jars ... 
--repositories 
https://repository.cloudera.com/artifactory/public/,https://repository.cloudera.com/content/repositories/releases/,http://repo.spring.io/plugins-release/";

Thanks,
- Ethan
--
Best,
- Ethan


--
Best Regards

Jeff Zhang


--
========= mailto:db...@incadencecorp.com ============
David W. Boyd
VP,  Data Solutions
10432 Balls Ford, Suite 240
Manassas, VA 20109
office:   +1-703-552-2862
cell:     +1-703-402-7908
============== http://www.incadencecorp.com/ ============
ISO/IEC JTC1 WG9, editor ISO/IEC 20547 Big Data Reference Architecture
Chair ANSI/INCITS TC Big Data
Co-chair NIST Big Data Public Working Group Reference Architecture
First Robotic Mentor - FRC, FTC - www.iliterobotics.org
Board Member- USSTEM Foundation - www.usstem.org


