Hi Jeff- I was able to make Kerberos work in the 0.5.6 Zeppelin build. It seems that Kerberos not working, and Spark not being able to talk to the shared Hive metastore, are defects in the current build.
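For reference, here is roughly what that looks like in zeppelin-env.sh. This is only a sketch based on the keytab path, principal, and queue quoted further down in this thread, mirroring the spark-shell invocation that worked; it is untested against the new build:

```shell
# zeppelin-env.sh -- sketch only; values are the ones quoted in this thread.
# Passing keytab/principal through SPARK_SUBMIT_OPTIONS mirrors the
# spark-shell command that was able to reach the Hive tables.
export HADOOP_CONF_DIR=/etc/hadoop/conf:/etc/hive/conf
export SPARK_SUBMIT_OPTIONS="--deploy-mode client --master yarn \
  --queue <QUEUE_NAME> \
  --conf spark.yarn.keytab=$HOME/pradeep.x.alla.keytab \
  --conf spark.yarn.principal=pradeep.x.alla"
```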
On Tue, Aug 30, 2016 at 11:09 PM, Pradeep Reddy <pradeepreddy.a...@gmail.com> wrote:

> Hi Jeff-
>
> I switched to local mode now. I'm able to summon the implicit objects like
> sc, sqlContext, etc., but it doesn't show my databases & tables; it just
> shows one database, "default".
>
> Zeppelin latest build
>
> [image: Inline image 3]
>
> Zeppelin 0.5.6, running on the same machine, is able to show my databases
> and tables.
>
> [image: Inline image 4]
>
> On Tue, Aug 30, 2016 at 8:20 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>
>> > the spark interpreter is not showing my tables & databases, maybe it's
>> > running in an isolated mode... I'm just getting an empty list, so I
>> > attempted to do Kerberos authentication to work around that issue, and
>> > bumped into this roadblock.
>>
>> Kerberos would not help here; actually, I think it would make the problem
>> more complicated. You need to first check the log to see why you get an
>> empty list.
>>
>> On Wed, Aug 31, 2016 at 8:56 AM, Pradeep Reddy <pradeepreddy.a...@gmail.com> wrote:
>>
>>> Jeff- I was also able to run spark-shell successfully, after running
>>> kdestroy, with the command below, and was able to get to my Hive tables.
>>>
>>> spark-shell --conf spark.yarn.keytab=$HOME/pradeep.x.alla.keytab --conf spark.yarn.principal=pradeep.x.alla --deploy-mode client --master yarn --queue <QUEUE_NAME>
>>>
>>> On Tue, Aug 30, 2016 at 7:34 PM, Pradeep Reddy <pradeepreddy.a...@gmail.com> wrote:
>>>
>>>> Thanks Jeff. I have always used Zeppelin in local mode, but when I
>>>> migrated from 0.5.6 to this version, the spark interpreter is not showing
>>>> my tables & databases, maybe it's running in an isolated mode... I'm just
>>>> getting an empty list, so I attempted to do Kerberos authentication to
>>>> work around that issue, and bumped into this roadblock.
>>>>
>>>> Below is the configuration. I also tested my keytab file and it's
>>>> working fine.
>>>> *Kerberos test:*
>>>> $ kdestroy
>>>>
>>>> $ klist
>>>> *klist: No credentials cache found (ticket cache FILE:/tmp/krb5cc_12027)*
>>>>
>>>> $ kinit -kt pradeep_x_alla.keytab -V pradeep.x.alla
>>>> *Using default cache: /tmp/krb5cc_12027*
>>>> *Using principal: pradeep.x.alla@<DOMAIN1>*
>>>> *Using keytab: pradeep_x_alla.keytab*
>>>> *Authenticated to Kerberos v5*
>>>>
>>>> $ klist
>>>> *Ticket cache: FILE:/tmp/krb5cc_12027*
>>>> *Default principal: pradeep.x.alla@<DOMAIN1>*
>>>>
>>>> *Valid starting     Expires            Service principal*
>>>> *08/30/16 20:25:19  08/31/16 06:25:19  krbtgt/<DOMAIN1>@<DOMAIN1>*
>>>> *        renew until 08/31/16 20:25:19*
>>>>
>>>> *zeppelin-env.sh*
>>>>
>>>> export HADOOP_CONF_DIR=/etc/hadoop/conf:/etc/hive/conf
>>>> export SPARK_HOME=/opt/cloudera/parcels/CDH-5.5.1-1.cdh5.5.1.p0.11/lib/spark
>>>> export SPARK_SUBMIT_OPTIONS="--deploy-mode client --master yarn --num-executors 2 --executor-memory 2g --queue <QUEUE_NAME>"
>>>>
>>>> *Interpreter.json (Spark interpreter config)*
>>>> "2BUTFVN89": {
>>>>   "id": "2BUTFVN89",
>>>>   "name": "spark",
>>>>   "group": "spark",
>>>>   "properties": {
>>>>     "spark.cores.max": "",
>>>>     "zeppelin.spark.printREPLOutput": "true",
>>>>     "master": "yarn-client",
>>>>     "zeppelin.spark.maxResult": "1000",
>>>>     "zeppelin.dep.localrepo": "local-repo",
>>>>     "spark.app.name": "Zeppelin",
>>>>     "spark.executor.memory": "",
>>>>     "zeppelin.spark.importImplicit": "true",
>>>>     "zeppelin.spark.sql.stacktrace": "true",
>>>>     "zeppelin.spark.useHiveContext": "true",
>>>>     "zeppelin.interpreter.localRepo": "/home/pradeep.x.alla/zeppelin/local-repo/2BUTFVN89",
>>>>     "zeppelin.spark.concurrentSQL": "false",
>>>>     "args": "",
>>>>     "zeppelin.pyspark.python": "python",
>>>>     "spark.yarn.keytab": "/home/pradeep.x.alla/pradeep.x.alla.keytab",
>>>>     "spark.yarn.principal": "pradeep.x.alla",
>>>>     "zeppelin.dep.additionalRemoteRepository": "spark-packages,http://dl.bintray.com/spark-packages/maven,false;"
>>>>   },
>>>>   "status": "READY",
>>>>   "interpreterGroup": [
>>>>     { "name": "spark",   "class": "org.apache.zeppelin.spark.SparkInterpreter",    "defaultInterpreter": true },
>>>>     { "name": "sql",     "class": "org.apache.zeppelin.spark.SparkSqlInterpreter", "defaultInterpreter": false },
>>>>     { "name": "dep",     "class": "org.apache.zeppelin.spark.DepInterpreter",      "defaultInterpreter": false },
>>>>     { "name": "pyspark", "class": "org.apache.zeppelin.spark.PySparkInterpreter",  "defaultInterpreter": false }
>>>>   ],
>>>>   "dependencies": [],
>>>>   "option": {
>>>>     "remote": true,
>>>>     "port": -1,
>>>>     "perNoteSession": false,
>>>>     "perNoteProcess": false,
>>>>     "isExistingProcess": false,
>>>>     "setPermission": false,
>>>>     "users": []
>>>>   }
>>>> }
>>>>
>>>> On Tue, Aug 30, 2016 at 6:52 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>>>>
>>>>> It looks like a Kerberos configuration issue on your side. Do you mind
>>>>> sharing your configuration? Or you can first try to run spark-shell
>>>>> using spark.yarn.keytab & spark.yarn.principal to verify them.
>>>>>
>>>>> On Wed, Aug 31, 2016 at 6:12 AM, Pradeep Reddy <pradeepreddy.a...@gmail.com> wrote:
>>>>>
>>>>>> Hi- I recently built Zeppelin from source and configured Kerberos
>>>>>> authentication. For Kerberos I added "spark.yarn.keytab" &
>>>>>> "spark.yarn.principal" and also set master to "yarn-client".
>>>>>> But I keep getting this error whenever I use the spark interpreter in
>>>>>> the notebook:
>>>>>>
>>>>>> ...remoteInterpretJob_1472593536728 started by scheduler org.apache.zeppelin.spark.SparkInterpreter335845091
>>>>>> ERROR [2016-08-30 17:45:37,237] ({pool-2-thread-2} Job.java[run]:189) - Job failed
>>>>>> java.lang.IllegalArgumentException: Invalid rule: L
>>>>>> RULE:[2:$1@$0](.*@\Q<DOMAIN1>.COM\E$)s/@\Q<DOMAIN1>\E$//L
>>>>>> RULE:[1:$1@$0](.*@\Q<DOMAIN2>\E$)s/@\Q<DOMAIN2>\E$//L
>>>>>> RULE:[2:$1@$0](.*@\Q<DOMAIN2>\E$)s/@\Q<DOMAIN2>\E$//L
>>>>>> DEFAULT
>>>>>>         at org.apache.hadoop.security.authentication.util.KerberosName.parseRules(KerberosName.java:321)
>>>>>>         at org.apache.hadoop.security.authentication.util.KerberosName.setRules(KerberosName.java:386)
>>>>>>         at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:75)
>>>>>>         at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:227)
>>>>>>         at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:214)
>>>>>>         at org.apache.hadoop.security.UserGroupInformation.isAuthenticationMethodEnabled(UserGroupInformation.java:275)
>>>>>>         at org.apache.hadoop.security.UserGroupInformation.isSecurityEnabled(UserGroupInformation.java:269)
>>>>>>         at org.apache.hadoop.security.UserGroupInformation.loginUserFromKeytab(UserGroupInformation.java:820)
>>>>>>         at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:539)
>>>>>>         at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
>>>>>>         at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
>>>>>>         at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:383)
>>>>>>         at org.apache.zeppelin.scheduler.Job.run(Job.java:176)
>>>>>>         at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
>>>>>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>>>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>>         at java.lang.Thread.run(Thread.java:745)
>>>>>> INFO [2016-08-30 17:45:37,247] ({pool-2-thread-2} SchedulerFactory.java[jobFinished]:137) - Job remoteInterpretJob_1472593536728 finished by scheduler org.apache.zeppelin.spark.SparkInterpreter335845091
>>>>>>
>>>>>> Thanks,
>>>>>> Pradeep
>>>>>
>>>>> --
>>>>> Best Regards
>>>>>
>>>>> Jeff Zhang
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
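A note on the "Invalid rule: L RULE:..." exception above, for anyone hitting the same thing. Hadoop's KerberosName.parseRules consumes the hadoop.security.auth_to_local value rule by rule and, when it hits text it cannot parse, throws with the unparsed remainder. One plausible reading of this trace (an assumption, not confirmed in the thread): the cluster's rules carry a trailing "L" (lowercase-the-result) flag, and the hadoop-auth version bundled in the Zeppelin build may be older than the cluster's and not recognize that flag, so the parser stops right at the "L" and reports "Invalid rule: L ...". The sketch below illustrates that failure mode with rough approximations of the two grammars; the regexes are illustrations, not Hadoop's exact code:

```python
import re

# Rough approximations of the auth_to_local rule grammar (illustration
# only, not Hadoop's exact parser):
#   OLD: RULE:[n:fmt](regex)s/pat/repl/g?  or  DEFAULT
#   NEW: same, plus an optional trailing "L" (lowercase the result)
OLD = re.compile(r"\s*(?:DEFAULT|RULE:\[\d+:[^\]]*\](?:\([^)]*\))?(?:s/[^/]*/[^/]*/g?)?)")
NEW = re.compile(r"\s*(?:DEFAULT|RULE:\[\d+:[^\]]*\](?:\([^)]*\))?(?:s/[^/]*/[^/]*/g?)?L?)")

def parse_rules(rules: str, grammar: re.Pattern) -> list[str]:
    """Consume rules left to right; on a non-match, raise with the
    unparsed remainder, mirroring the 'Invalid rule: ...' error."""
    out, pos = [], 0
    while pos < len(rules):
        m = grammar.match(rules, pos)
        if not m or m.end() == pos:
            raise ValueError("Invalid rule: " + rules[pos:].lstrip())
        out.append(m.group().strip())
        pos = m.end()
    return out

# A rule with the trailing lowercase flag, like the ones in the trace
# (domains replaced with a placeholder realm).
rules = "RULE:[1:$1@$0](.*@EXAMPLE.COM$)s/@EXAMPLE.COM$//L\nDEFAULT"

print(parse_rules(rules, NEW))   # newer grammar: both rules parse
try:
    parse_rules(rules, OLD)
except ValueError as e:
    print(e)                     # older grammar stops at the "L",
                                 # echoing the error shape in the trace
```

If that reading is right, the fix would be building Zeppelin against a Hadoop version matching the cluster rather than changing the Kerberos setup.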