What is your Spark interpreter configuration?
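
For reference, on a kerberized YARN cluster the Spark interpreter is usually pointed at a keytab, so it never depends on an interactive kinit ticket. A minimal sketch of the relevant interpreter properties (the keytab path and principal below are placeholders, not values from your cluster):

    spark.yarn.keytab        /etc/security/keytabs/zeppelin.service.keytab
    spark.yarn.principal     zeppelin@EXAMPLE.COM

If you are relying on the ticket cache instead, the interpreter process needs a valid TGT at launch time; the "3293 fail over attempts" in your log suggest the Hadoop IPC client is retrying indefinitely without one.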

András Kolbert <kolbertand...@gmail.com> wrote on Friday, August 3, 2018 at 4:24 PM:

> Hi,
>
> We are experiencing issues with the behaviour of the Spark interpreter in
> Zeppelin. Whenever we launch a note without a valid Kerberos ticket, the
> application keeps trying to authenticate and never times out. In this
> state the interpreter cannot be restarted and the note cannot be
> cancelled; only a restart of the whole application helps.
>
> Is there any way to kill a particular execution and get out of this loop?
> Thanks
> Andras
>
> Log:
>
> INFO [2018-08-02 10:09:01,563] ({pool-2-thread-3} ConfiguredRMFailoverProxyProvider.java[performFailover]:100) - Failing over to rm119
>  WARN [2018-08-02 10:09:01,567] ({pool-2-thread-3} UserGroupInformation.java[doAs]:1920) - PriviledgedActionException as:zeppelin (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>  WARN [2018-08-02 10:09:01,567] ({pool-2-thread-3} Client.java[run]:713) - Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>  WARN [2018-08-02 10:09:01,568] ({pool-2-thread-3} UserGroupInformation.java[doAs]:1920) - PriviledgedActionException as:zeppelin (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>  INFO [2018-08-02 10:09:01,570] ({pool-2-thread-3} RetryInvocationHandler.java[invoke]:150) - Exception while invoking getClusterMetrics of class ApplicationClientProtocolPBClientImpl over rm119 after 3293 fail over attempts. Trying to fail over immediately.
> java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "host.com/1.1.1.1"; destination host is: "host.com":8032;
>         at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1508)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1441)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>         at com.sun.proxy.$Proxy17.getClusterMetrics(Unknown Source)
>         at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:202)
>         at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>         at com.sun.proxy.$Proxy18.getClusterMetrics(Unknown Source)
>         at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:483)
>         at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:158)
>         at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:158)
>         at org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
>         at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:61)
>         at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:157)
>         at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
>         at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:165)
>         at org.apache.spark.SparkContext.<init>(SparkContext.scala:512)
>         at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2511)
>         at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
>         at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
>         at scala.Option.getOrElse(Option.scala:121)
>         at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at org.apache.zeppelin.spark.BaseSparkScalaInterpreter.spark2CreateContext(BaseSparkScalaInterpreter.scala:189)
>         at org.apache.zeppelin.spark.BaseSparkScalaInterpreter.createSparkContext(BaseSparkScalaInterpreter.scala:124)
>         at org.apache.zeppelin.spark.SparkScala211Interpreter.open(SparkScala211Interpreter.scala:87)
>         at org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:102)
>         at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
>         at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>         at org.apache.zeppelin.spark.PySparkInterpreter.getSparkInterpreter(PySparkInterpreter.java:664)
>         at org.apache.zeppelin.spark.PySparkInterpreter.createGatewayServerAndStartScript(PySparkInterpreter.java:260)
>         at org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:194)
>         at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>         at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
>         at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
>         at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>         at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:718)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
>         at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:681)
>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:769)
>         at org.apache.hadoop.ipc.Client$Connection.access$3000(Client.java:396)
>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1557)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1480)
>         ... 48 more
> Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>         at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
>         at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:413)
>         at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:594)
>         at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:396)
>         at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:761)
>         at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:757)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:756)
>         ... 51 more
> Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
>         at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
>         at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:122)
>         at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
>         at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:224)
>         at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
>         at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
>         at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:192)
>         ... 60 more
>  INFO [2018-08-02 10:09:01,572] ({pool-2-thread-3} ConfiguredRMFailoverProxyProvider.java[performFailover]:100) - Failing over to rm112
>  INFO [2018-08-02 10:09:01,574] ({pool-2-thread-3} RetryInvocationHandler.java[invoke]:150) - Exception while invoking getClusterMetrics of class ApplicationClientProtocolPBClientImpl over rm112 after 3294 fail over attempts. Trying to fail over after sleeping for 1903ms.
> java.net.ConnectException: Call From host.com/1.1.1.1 to host.com:8032 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>         at sun.reflect.GeneratedConstructorAccessor14.newInstance(Unknown Source)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>         at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
>         at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1508)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1441)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>         at com.sun.proxy.$Proxy17.getClusterMetrics(Unknown Source)
>         at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterMetrics(ApplicationClientProtocolPBClientImpl.java:202)
>         at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>         at com.sun.proxy.$Proxy18.getClusterMetrics(Unknown Source)
>         at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getYarnClusterMetrics(YarnClientImpl.java:483)
>         at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:158)
>         at org.apache.spark.deploy.yarn.Client$$anonfun$submitApplication$1.apply(Client.scala:158)
>         at org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
>         at org.apache.spark.deploy.yarn.Client.logInfo(Client.scala:61)
>         at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:157)
>         at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
>         at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:165)
>         at org.apache.spark.SparkContext.<init>(SparkContext.scala:512)
>         at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2511)
>         at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
>         at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
>         at scala.Option.getOrElse(Option.scala:121)
>         at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:497)
>         at org.apache.zeppelin.spark.BaseSparkScalaInterpreter.spark2CreateContext(BaseSparkScalaInterpreter.scala:189)
>         at org.apache.zeppelin.spark.BaseSparkScalaInterpreter.createSparkContext(BaseSparkScalaInterpreter.scala:124)
>         at org.apache.zeppelin.spark.SparkScala211Interpreter.open(SparkScala211Interpreter.scala:87)
>         at org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:102)
>         at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
>         at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>         at org.apache.zeppelin.spark.PySparkInterpreter.getSparkInterpreter(PySparkInterpreter.java:664)
>         at org.apache.zeppelin.spark.PySparkInterpreter.createGatewayServerAndStartScript(PySparkInterpreter.java:260)
>         at org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:194)
>         at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>         at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
>         at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
>         at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>         at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
>         at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
>         at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:648)
>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:744)
>         at org.apache.hadoop.ipc.Client$Connection.access$3000(Client.java:396)
>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:1557)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1480)
>         ... 48 more
>
