1. Don't create a sqlContext in Zeppelin; Zeppelin creates one for you, and
%sql uses the sqlContext created by Zeppelin itself.
2. Make sure you have hive-site.xml under SPARK_CONF_DIR if you want to use
a HiveContext. Otherwise Spark will use an embedded single-user Derby
metastore, which is not meant for production and will cause conflicts when
you create multiple Spark interpreters in one Zeppelin instance.
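A minimal sketch of point 1 as two Zeppelin paragraphs (assuming
zeppelin.spark.useHiveContext is set to true, so the sqlContext Zeppelin
injects is already a HiveContext; hivedb.hivetable is taken from the
question below):

```scala
%spark
// Do not create your own HiveContext; reuse the sqlContext that
// Zeppelin injects into the paragraph. With
// zeppelin.spark.useHiveContext=true it is already a HiveContext.
val result = sqlContext.sql("select * from hivedb.hivetable")
result.registerTempTable("myTest")
```

```sql
%sql
select * from myTest
```

Because %sql runs against the same Zeppelin-created sqlContext, the temp
table registered in the %spark paragraph is visible to it.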

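For point 2, a minimal hive-site.xml sketch pointing Spark at a shared Hive
metastore (the host name and port are placeholders, not taken from this
thread):

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Point Spark's HiveContext at the shared Hive metastore
       instead of the embedded single-user Derby instance. -->
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore-host:9083</value>
  </property>
</configuration>
```

On CDH this file typically already exists under /etc/hive/conf; copying or
symlinking it into SPARK_CONF_DIR is usually enough.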

LINZ, Arnaud <al...@bouyguestelecom.fr> wrote on Monday, February 5, 2018, at 8:33 PM:

> Hello,
>
>
>
> I’m trying to install Zeppelin (0.7.2) on my CDH cluster, and I am unable
> to connect the SQL queries and graphical representations of the %sql
> interpreter to my Hive data. More surprisingly, I really can’t find any
> good source on the internet (Apache Zeppelin documentation or Stack
> Overflow) that gives a practical answer about how to do this.
>
> Most of the time, the data comes from compressed Hive tables and not
> plain HDFS text files, so using a Hive context is far more convenient
> than a plain Spark SQL context.
>
>
>
> The following:
>
> %spark
>
> val hc = new org.apache.spark.sql.hive.HiveContext(sc)
>
> val result = hc.sql("select * from hivedb.hivetable")
>
> result.registerTempTable("myTest")
>
>
>
> works, but no myTest table is available in the following %sql interpreter:
>
> %sql
>
> select * from myTest
>
> org.apache.spark.sql.AnalysisException: Table not found: myTest;
>
>
>
>
>
> However, the following:
>
> %pyspark
>
> result = sqlContext.read.text("hdfs://cluster/test.txt")
>
> result.registerTempTable("mySqlTest")
>
>
>
> works, as the %sql interpreter is “plugged in” to the sqlContext,
>
>
>
> but
>
> result = sqlContext.sql("select * from hivedb.hivetable") does not work,
> as the sqlContext is not a Hive context.
>
>
>
> I have set zeppelin.spark.useHiveContext to true, but it seems to have no
> effect (by the way, it was more of a wild guess, since the documentation
> does not give much detail on parameters and context configuration).
>
>
>
> Can you direct me towards how to configure the context used by the %sql
> interpreter?
>
>
>
> Best regards,
>
> Arnaud
>
>
>
> PS : %spark and %sql interpreter conf:
>
>
>
> master  yarn-client
>
> spark.app.name  Zeppelin
>
> spark.cores.max
>
> spark.executor.memory   5g
>
> zeppelin.R.cmd  R
>
> zeppelin.R.image.width  100%
>
> zeppelin.R.knitr    true
>
> zeppelin.R.render.options   out.format = 'html', comment = NA, echo =
> FALSE, results = 'asis', message = F, warning = F
>
> zeppelin.dep.additionalRemoteRepository spark-packages,
> http://dl.bintray.com/spark-packages/maven,false;
>
> zeppelin.dep.localrepo  local-repo
>
> zeppelin.interpreter.localRepo  /opt/zeppelin/local-repo/2CYVF45A9
>
> zeppelin.interpreter.output.limit   102400
>
> zeppelin.pyspark.python /usr/bin/pyspark
>
> zeppelin.spark.concurrentSQL    true
>
> zeppelin.spark.importImplicit   true
>
> zeppelin.spark.maxResult    1000
>
> zeppelin.spark.printREPLOutput  true
>
> zeppelin.spark.sql.stacktrace   true
>
> zeppelin.spark.useHiveContext   true
>
> ------------------------------
>
> The integrity of this message cannot be guaranteed on the Internet. The
> company that sent this message cannot therefore be held liable for its
> content nor attachments. Any unauthorized use or dissemination is
> prohibited. If you are not the intended recipient of this message, then
> please delete it and notify the sender.
>
