Supreeth Sharma created ZEPPELIN-3114:
-----------------------------------------

             Summary: Notebooks and interpreters are not getting saved in 
zeppelin after >1d stress testing
                 Key: ZEPPELIN-3114
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3114
             Project: Zeppelin
          Issue Type: Bug
          Components: zeppelin-server
    Affects Versions: 0.7.3
            Reporter: Supreeth Sharma


Scenario:
36-hour test
14-node secured, encrypted cluster (CentOS 7 based)
Simulated load of around 13 users running a set of 19 notebooks periodically, 
as per a defined schedule

After 24 hours, Zeppelin stopped functioning.
Issue 1:
Not able to create a new notebook or update an existing one.
Issue 2:
Not able to modify interpreter settings. The save action never completes in the UI.
Issue 3:
Not able to run paragraphs.
The following errors appear in the Zeppelin logs:

{code}
WARN [2017-12-19 13:18:48,128] ({qtp1076835071-86681} Client.java[run]:715) - 
Exception encountered while connecting to the server : 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]
 INFO [2017-12-19 13:18:48,128] ({qtp1076835071-86681} 
RetryInvocationHandler.java[log]:280) - java.io.IOException: Failed on local 
exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate 
failed [Caused by GSSException: No valid credentials provided (Mechanism level: 
Failed to find any Kerberos tgt)]; Host Details : local host is: 
"ctr-e136-1513029738776-12293-01-000004.hwx.site/172.27.22.148"; destination 
host is: "ctr-e136-1513029738776-12293-01-000004.hwx.site":8020; , while 
invoking ClientNamenodeProtocolTranslatorPB.create over 
ctr-e136-1513029738776-12293-01-000004.hwx.site/172.27.22.148:8020 after 12 
failover attempts. Trying to failover after sleeping for 15905ms.
{code}
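
To pin down when the failures began in a long-running log, a small script can scan for the first entry that mentions the missing Kerberos TGT. The following is only a sketch: the function name `first_tgt_failure` is hypothetical, and it assumes the script is run against the original, unwrapped log file, where each entry starts with a level and a bracketed timestamp as in the excerpt above.

```python
import re
from datetime import datetime

# Prefix format taken from the excerpt above, e.g.
# "WARN [2017-12-19 13:18:48,128] (...)". Assumed to hold for every entry.
LINE_RE = re.compile(
    r"^\s*(?P<level>[A-Z]+)\s+\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\]"
)

# Message fragment copied from the GSSException text in the report.
TGT_MARKER = "Failed to find any Kerberos tgt"


def first_tgt_failure(lines):
    """Return the timestamp of the first entry mentioning a missing TGT, or None."""
    current_ts = None
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            # Remember the timestamp of the most recent entry header, so that
            # continuation lines (stack traces) inherit it.
            current_ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
        if TGT_MARKER in line and current_ts is not None:
            return current_ts
    return None
```

Comparing the returned timestamp with the server start time shows how long Zeppelin ran before the first failure. An interval close to the configured ticket lifetime would be consistent with an expired ticket, though the report itself does not confirm a cause.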



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
