I think it's related to https://issues.apache.org/jira/browse/ZEPPELIN-1175,
which removes some of the classpath when Zeppelin launches an interpreter.
Could you please check whether your hive-site.xml is included in your
interpreter process? It looks like a configuration issue, because you can
see the default database. If it isn't included, you should copy your XML
into interpreter/spark/dep/.
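
For example, a quick check and, if needed, the copy might look like this
(ZEPPELIN_HOME and the Hive conf path are assumptions; adjust them for your
install):

# Look at the running interpreter JVM's command line / classpath:
ps aux | grep [R]emoteInterpreterServer

# If hive-site.xml is not visible there, copy it in:
cp /etc/hive/conf/hive-site.xml "$ZEPPELIN_HOME"/interpreter/spark/dep/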

Regards,
JL

On Wed, Aug 31, 2016 at 9:52 PM, Pradeep Reddy <pradeepreddy.a...@gmail.com>
wrote:

> Hi Jongyoul- I followed the exact same steps for compiling and setting up
> the new build from source as I did for 0.5.6 (the only difference is that
> I acquired the source for the latest build using "git clone").
>
> hive-site.xml was copied to the conf directory, but the Spark interpreter
> is not talking to the Hive metastore. Both the 0.5.6 and the latest builds
> are running on the same machine. In 0.5.6, when I run the command below, I
> see 116 databases listed, as I expect, and I'm able to run my notebooks
> built on those databases.
>
> [image: Inline image 1]
>
> Thanks,
> Pradeep
>
>
> On Wed, Aug 31, 2016 at 2:52 AM, Jongyoul Lee <jongy...@gmail.com> wrote:
>
>> Hello,
>>
>> Did you copy your hive-site.xml into the proper position?
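>>
>> For instance, copying it into Zeppelin's conf directory would look like
>> this (paths are assumptions; adjust for your install):
>>
>> cp /etc/hive/conf/hive-site.xml "$ZEPPELIN_HOME"/conf/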
>>
>> On Wed, Aug 31, 2016 at 3:52 PM, Pradeep Reddy <
>> pradeepreddy.a...@gmail.com> wrote:
>>
>>> Nothing obvious. I will stick to the 0.5.6 build until the latest
>>> builds stabilize.
>>>
>>> On Wed, Aug 31, 2016 at 1:39 AM, Jeff Zhang <zjf...@gmail.com> wrote:
>>>
>>>> Then I guess maybe you are connecting to a different database. Why not
>>>> use 'z.show(sql("show databases"))' to display the databases? Then you
>>>> will get a hint about what's going on.
>>>>
>>>> On Wed, Aug 31, 2016 at 2:36 PM, Pradeep Reddy <
>>>> pradeepreddy.a...@gmail.com> wrote:
>>>>
>>>>> Yes... I didn't wish to show the names of the databases that we have
>>>>> in our data lake in that screenshot, so I chose to display the count
>>>>> instead. The latest Zeppelin build just shows a count of 1, which is
>>>>> the "default" database.
>>>>>
>>>>> Thanks,
>>>>> Pradeep
>>>>>
>>>>> On Wed, Aug 31, 2016 at 1:33 AM, Jeff Zhang <zjf...@gmail.com> wrote:
>>>>>
>>>>>> 116 is the database count. Do you expect a list of databases? Then
>>>>>> you need to use 'z.show(sql("show databases"))'.
>>>>>>
>>>>>> On Wed, Aug 31, 2016 at 2:26 PM, Pradeep Reddy <
>>>>>> pradeepreddy.a...@gmail.com> wrote:
>>>>>>
>>>>>>> Here it is, Jeff:
>>>>>>>
>>>>>>> [image: Inline image 1]
>>>>>>>
>>>>>>> On Wed, Aug 31, 2016 at 1:24 AM, Jeff Zhang <zjf...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Pradeep,
>>>>>>>>
>>>>>>>> I don't see the databases in your screenshot (the second one, for
>>>>>>>> 0.5.6). I think the output is correct.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Aug 31, 2016 at 12:55 PM, Pradeep Reddy <
>>>>>>>> pradeepreddy.a...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Jeff- I was able to make Kerberos work in the 0.5.6 Zeppelin
>>>>>>>>> build. It seems that Kerberos not working and Spark not being able to
>>>>>>>>> talk to the shared Hive metastore are defects in the current build.
>>>>>>>>>
>>>>>>>>> On Tue, Aug 30, 2016 at 11:09 PM, Pradeep Reddy <
>>>>>>>>> pradeepreddy.a...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Jeff-
>>>>>>>>>>
>>>>>>>>>> I switched to local mode now. I'm able to access the implicit
>>>>>>>>>> objects like sc, sqlContext, etc., but it doesn't show my databases &
>>>>>>>>>> tables; it just shows one database, "default".
>>>>>>>>>>
>>>>>>>>>> Zeppelin Latest Build
>>>>>>>>>>
>>>>>>>>>> [image: Inline image 3]
>>>>>>>>>>
>>>>>>>>>> Zeppelin 0.5.6, running on the same machine, is able to show my
>>>>>>>>>> databases and tables.
>>>>>>>>>>
>>>>>>>>>> [image: Inline image 4]
>>>>>>>>>>
>>>>>>>>>> On Tue, Aug 30, 2016 at 8:20 PM, Jeff Zhang <zjf...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> > the Spark interpreter is not showing my tables & databases;
>>>>>>>>>>> maybe it's running in an isolated mode... I'm just getting an empty
>>>>>>>>>>> list, so I attempted Kerberos authentication to work around that
>>>>>>>>>>> issue, and bumped into this roadblock.
>>>>>>>>>>>
>>>>>>>>>>> Kerberos would not help here; actually, I think it would make the
>>>>>>>>>>> problem more complicated. You first need to check the log to see why
>>>>>>>>>>> you get an empty list.
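>>>>>>>>>>>
>>>>>>>>>>> For example, something like this should show the interpreter-side
>>>>>>>>>>> errors (the log file name is an assumption; adjust for your install):
>>>>>>>>>>>
>>>>>>>>>>> tail -n 200 "$ZEPPELIN_HOME"/logs/zeppelin-interpreter-spark-*.log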
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Aug 31, 2016 at 8:56 AM, Pradeep Reddy <
>>>>>>>>>>> pradeepreddy.a...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Jeff- After running kdestroy, I was also able to run spark-shell
>>>>>>>>>>>> successfully with the command below, and was able to get to my Hive
>>>>>>>>>>>> tables.
>>>>>>>>>>>>
>>>>>>>>>>>> spark-shell --conf spark.yarn.keytab=$HOME/pradeep.x.alla.keytab \
>>>>>>>>>>>>   --conf spark.yarn.principal=pradeep.x.alla \
>>>>>>>>>>>>   --deploy-mode client --master yarn --queue <QUEUE_NAME>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Aug 30, 2016 at 7:34 PM, Pradeep Reddy <
>>>>>>>>>>>> pradeepreddy.a...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks Jeff. I have always used Zeppelin in local mode, but when
>>>>>>>>>>>>> I migrated from 0.5.6 to this version, the Spark interpreter is not
>>>>>>>>>>>>> showing my tables & databases; maybe it's running in an isolated
>>>>>>>>>>>>> mode... I'm just getting an empty list, so I attempted Kerberos
>>>>>>>>>>>>> authentication to work around that issue, and bumped into this
>>>>>>>>>>>>> roadblock.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Below is the configuration. I also tested my keytab file, and it's
>>>>>>>>>>>>> working fine.
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Kerberos test:*
>>>>>>>>>>>>> $ kdestroy
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ klist
>>>>>>>>>>>>> *klist: No credentials cache found (ticket cache
>>>>>>>>>>>>> FILE:/tmp/krb5cc_12027)*
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ kinit -kt pradeep_x_alla.keytab -V pradeep.x.alla
>>>>>>>>>>>>> *Using default cache: /tmp/krb5cc_12027*
>>>>>>>>>>>>> *Using principal: pradeep.x.alla@<DOMAIN1>*
>>>>>>>>>>>>> *Using keytab: pradeep_x_alla.keytab*
>>>>>>>>>>>>> *Authenticated to Kerberos v5*
>>>>>>>>>>>>>
>>>>>>>>>>>>> $ klist
>>>>>>>>>>>>> *Ticket cache: FILE:/tmp/krb5cc_12027*
>>>>>>>>>>>>> *Default principal: pradeep.x.alla@<DOMAIN1>*
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Valid starting     Expires            Service principal*
>>>>>>>>>>>>> *08/30/16 20:25:19  08/31/16 06:25:19  krbtgt/<DOMAIN1>@<DOMAIN1>*
>>>>>>>>>>>>> *        renew until 08/31/16 20:25:19*
>>>>>>>>>>>>>
>>>>>>>>>>>>> *zeppelin-env.sh*
>>>>>>>>>>>>>
>>>>>>>>>>>>> export HADOOP_CONF_DIR=/etc/hadoop/conf:/etc/hive/conf
>>>>>>>>>>>>> export SPARK_HOME=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/spark
>>>>>>>>>>>>> export SPARK_SUBMIT_OPTIONS="--deploy-mode client --master yarn --num-executors 2 --executor-memory 2g --queue <QUEUE_NAME>"
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Interpreter.json (Spark interpreter config)*
>>>>>>>>>>>>> "2BUTFVN89": {
>>>>>>>>>>>>>       "id": "2BUTFVN89",
>>>>>>>>>>>>>       "name": "spark",
>>>>>>>>>>>>>       "group": "spark",
>>>>>>>>>>>>>       "properties": {
>>>>>>>>>>>>>         "spark.cores.max": "",
>>>>>>>>>>>>>         "zeppelin.spark.printREPLOutput": "true",
>>>>>>>>>>>>>         "master": "yarn-client",
>>>>>>>>>>>>>         "zeppelin.spark.maxResult": "1000",
>>>>>>>>>>>>>         "zeppelin.dep.localrepo": "local-repo",
>>>>>>>>>>>>>         "spark.app.name": "Zeppelin",
>>>>>>>>>>>>>         "spark.executor.memory": "",
>>>>>>>>>>>>>         "zeppelin.spark.importImplicit": "true",
>>>>>>>>>>>>>         "zeppelin.spark.sql.stacktrace": "true",
>>>>>>>>>>>>>         "zeppelin.spark.useHiveContext": "true",
>>>>>>>>>>>>>         "zeppelin.interpreter.localRepo":
>>>>>>>>>>>>> "/home/pradeep.x.alla/zeppelin/local-repo/2BUTFVN89",
>>>>>>>>>>>>>         "zeppelin.spark.concurrentSQL": "false",
>>>>>>>>>>>>>         "args": "",
>>>>>>>>>>>>>         "zeppelin.pyspark.python": "python",
>>>>>>>>>>>>>         "spark.yarn.keytab": "/home/pradeep.x.alla/pradeep.
>>>>>>>>>>>>> x.alla.keytab",
>>>>>>>>>>>>>         "spark.yarn.principal": "pradeep.x.alla",
>>>>>>>>>>>>>         "zeppelin.dep.additionalRemoteRepository":
>>>>>>>>>>>>> "spark-packages,http://dl.bintray.com/spark-packages/maven,f
>>>>>>>>>>>>> alse;"
>>>>>>>>>>>>>       },
>>>>>>>>>>>>>       "status": "READY",
>>>>>>>>>>>>>       "interpreterGroup": [
>>>>>>>>>>>>>         {
>>>>>>>>>>>>>           "name": "spark",
>>>>>>>>>>>>>           "class": "org.apache.zeppelin.spark.Spa
>>>>>>>>>>>>> rkInterpreter",
>>>>>>>>>>>>>           "defaultInterpreter": true
>>>>>>>>>>>>>         },
>>>>>>>>>>>>>         {
>>>>>>>>>>>>>           "name": "sql",
>>>>>>>>>>>>>           "class": "org.apache.zeppelin.spark.Spa
>>>>>>>>>>>>> rkSqlInterpreter",
>>>>>>>>>>>>>           "defaultInterpreter": false
>>>>>>>>>>>>>         },
>>>>>>>>>>>>>         {
>>>>>>>>>>>>>           "name": "dep",
>>>>>>>>>>>>>           "class": "org.apache.zeppelin.spark.DepInterpreter",
>>>>>>>>>>>>>           "defaultInterpreter": false
>>>>>>>>>>>>>         },
>>>>>>>>>>>>>         {
>>>>>>>>>>>>>           "name": "pyspark",
>>>>>>>>>>>>>           "class": "org.apache.zeppelin.spark.PyS
>>>>>>>>>>>>> parkInterpreter",
>>>>>>>>>>>>>           "defaultInterpreter": false
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>       ],
>>>>>>>>>>>>>       "dependencies": [],
>>>>>>>>>>>>>       "option": {
>>>>>>>>>>>>>         "remote": true,
>>>>>>>>>>>>>         "port": -1,
>>>>>>>>>>>>>         "perNoteSession": false,
>>>>>>>>>>>>>         "perNoteProcess": false,
>>>>>>>>>>>>>         "isExistingProcess": false,
>>>>>>>>>>>>>         "setPermission": false,
>>>>>>>>>>>>>         "users": []
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>     }
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Aug 30, 2016 at 6:52 PM, Jeff Zhang <zjf...@gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> It looks like a Kerberos configuration issue. Do you mind
>>>>>>>>>>>>>> sharing your configuration? Or you can first try to run spark-shell
>>>>>>>>>>>>>> with spark.yarn.keytab & spark.yarn.principal to verify them.
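>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For example, a sketch of that check (keytab path, principal, and
>>>>>>>>>>>>>> queue are placeholders):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> spark-shell --master yarn --deploy-mode client \
>>>>>>>>>>>>>>   --conf spark.yarn.keytab=/path/to/your.keytab \
>>>>>>>>>>>>>>   --conf spark.yarn.principal=your-principal \
>>>>>>>>>>>>>>   --queue <QUEUE_NAME>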
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Aug 31, 2016 at 6:12 AM, Pradeep Reddy <
>>>>>>>>>>>>>> pradeepreddy.a...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi- I recently built Zeppelin from source and configured
>>>>>>>>>>>>>>> Kerberos authentication. For Kerberos, I added "spark.yarn.keytab"
>>>>>>>>>>>>>>> & "spark.yarn.principal" and also set master to "yarn-client". But
>>>>>>>>>>>>>>> I keep getting this error whenever I use the Spark interpreter in
>>>>>>>>>>>>>>> a notebook:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 3536728 started by scheduler org.apache.zeppelin.spark.SparkInterpreter335845091
>>>>>>>>>>>>>>> ERROR [2016-08-30 17:45:37,237] ({pool-2-thread-2} Job.java[run]:189) - Job failed
>>>>>>>>>>>>>>> java.lang.IllegalArgumentException: Invalid rule: L
>>>>>>>>>>>>>>> RULE:[2:$1@$0](.*@\Q<DOMAIN1>.COM\E$)s/@\Q<DOMAIN1>\E$//L
>>>>>>>>>>>>>>> RULE:[1:$1@$0](.*@\Q<DOMAIN2>\E$)s/@\Q<DOMAIN2>\E$//L
>>>>>>>>>>>>>>> RULE:[2:$1@$0](.*@\Q<DOMAIN2>\E$)s/@\Q<DOMAIN2>\E$//L
>>>>>>>>>>>>>>> DEFAULT
>>>>>>>>>>>>>>>         at org.apache.hadoop.security.authentication.util.KerberosName.parseRules(KerberosName.java:321)
>>>>>>>>>>>>>>>         at org.apache.hadoop.security.authentication.util.KerberosName.setRules(KerberosName.java:386)
>>>>>>>>>>>>>>>         at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:75)
>>>>>>>>>>>>>>>         at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:227)
>>>>>>>>>>>>>>>         at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:214)
>>>>>>>>>>>>>>>         at org.apache.hadoop.security.UserGroupInformation.isAuthenticationMethodEnabled(UserGroupInformation.java:275)
>>>>>>>>>>>>>>>         at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:269)
>>>>>>>>>>>>>>>         at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:820)
>>>>>>>>>>>>>>>         at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:539)
>>>>>>>>>>>>>>>         at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>>>>>>>>>>>>>>>         at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
>>>>>>>>>>>>>>>         at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:383)
>>>>>>>>>>>>>>>         at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
>>>>>>>>>>>>>>>         at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
>>>>>>>>>>>>>>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>>>>>>>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>>>>>>>>>>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>>>>>>>>>>>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>>>>>>>>>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>>>>>>>>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>>>>>>>>>>>         at java.lang.Thread.run(Thread.java:745)
>>>>>>>>>>>>>>>  INFO [2016-08-30 17:45:37,247] ({pool-2-thread-2} SchedulerFactory.java[jobFinished]:137) - Job remoteInterpretJob_1472593536728 finished by scheduler org.apache.zeppelin.spark.SparkInterpreter335845091
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Pradeep
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Best Regards
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Jeff Zhang
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Best Regards
>>>>>>>>>>>
>>>>>>>>>>> Jeff Zhang
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Best Regards
>>>>>>>>
>>>>>>>> Jeff Zhang
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best Regards
>>>>>>
>>>>>> Jeff Zhang
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards
>>>>
>>>> Jeff Zhang
>>>>
>>>
>>>
>>
>>
>> --
>> 이종열, Jongyoul Lee, 李宗烈
>> http://madeng.net
>>
>
>


-- 
이종열, Jongyoul Lee, 李宗烈
http://madeng.net
