I would suggest adding a config parameter that allows bypassing initialization of HiveContext in case of SQLException
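Just to sketch the idea (the flag name "spark.repl.skipHiveContext" below is made up for illustration, not an existing Spark option, and the wiring is only a rough guess at how the shell startup could behave), something along these lines would let the REPL degrade to a plain SQLContext instead of failing:

    import scala.util.control.NonFatal

    import org.apache.spark.SparkContext
    import org.apache.spark.sql.SQLContext

    // Rough sketch only: fall back to a plain SQLContext when the user opts out of
    // Hive, or when HiveContext initialization fails (e.g. metastore/Derby errors).
    def createSqlContext(sc: SparkContext): SQLContext = {
      // "spark.repl.skipHiveContext" is a hypothetical name for the proposed flag.
      val skipHive = sc.getConf.getBoolean("spark.repl.skipHiveContext", false)
      if (skipHive) {
        new SQLContext(sc)
      } else {
        try {
          // Load HiveContext reflectively so the shell still works without the hive module.
          Class.forName("org.apache.spark.sql.hive.HiveContext")
            .getConstructor(classOf[SparkContext])
            .newInstance(sc)
            .asInstanceOf[SQLContext]
        } catch {
          case NonFatal(_) =>
            // Hive support could not be brought up; degrade gracefully instead of failing the shell.
            new SQLContext(sc)
        }
      }
    }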
Cheers

On Fri, Nov 6, 2015 at 2:50 PM, Zhan Zhang <zzh...@hortonworks.com> wrote:

> Hi Jerry,
>
> OK. Here is an ugly workaround.
>
> Put a hive-site.xml with invalid content under $SPARK_HOME/conf. You will
> get a bunch of exceptions because HiveContext initialization fails, but
> you can then initialize your own SQLContext:
>
> scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
> sqlContext: org.apache.spark.sql.SQLContext =
> org.apache.spark.sql.SQLContext@4a5cc2e8
>
> scala> import sqlContext.implicits._
> import sqlContext.implicits._
>
> For example:
>
> HW11188:spark zzhang$ more conf/hive-site.xml
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> <configuration>
>
>   <property>
>     <name>hive.metastore.uris</name>
>     <value>thrift://zzhang-yarn11:9083</value>
>   </property>
>
> </configuration>
> HW11188:spark zzhang$
>
> By the way, I don’t know whether there is any caveat to this workaround.
>
> Thanks.
>
> Zhan Zhang
>
>
> On Nov 6, 2015, at 2:40 PM, Jerry Lam <chiling...@gmail.com> wrote:
>
> Hi Zhan,
>
> I don’t use HiveContext features at all. I mostly use the DataFrame API. It is
> sexier and much less typo-prone. :)
> Also, HiveContext requires a metastore database setup (Derby by default).
> The problem is that I cannot have two spark-shell sessions running at the
> same time on the same host (e.g. from the /home/jerry directory). It gives me an
> exception like the one below.
>
> Since I don’t use HiveContext, I don’t see the need to maintain a
> database.
>
> What is interesting is that the pyspark shell is able to start more than one
> session at the same time. I wonder what pyspark does better than
> spark-shell?
>
> Best Regards,
>
> Jerry
>
> On Nov 6, 2015, at 5:28 PM, Zhan Zhang <zzh...@hortonworks.com> wrote:
>
> If your assembly jar has the Hive jars included, HiveContext will be used.
> Typically, HiveContext has more functionality than SQLContext. In what case
> do you need something from SQLContext that cannot be done with HiveContext?
>
> Thanks.
>
> Zhan Zhang
>
> On Nov 6, 2015, at 10:43 AM, Jerry Lam <chiling...@gmail.com> wrote:
>
> What is interesting is that the pyspark shell works fine with multiple sessions
> on the same host, even though multiple HiveContexts have been created. What
> does pyspark do differently when starting up the shell?
>
> On Nov 6, 2015, at 12:12 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
> In SQLContext.scala:
>
>   // After we have populated SQLConf, we call setConf to populate other
>   // confs in the subclass (e.g. hiveconf in HiveContext).
>   properties.foreach {
>     case (key, value) => setConf(key, value)
>   }
>
> I don't see a config for skipping the above call.
>
> FYI
>
> On Fri, Nov 6, 2015 at 8:53 AM, Jerry Lam <chiling...@gmail.com> wrote:
>
>> Hi spark users and developers,
>>
>> Is it possible to disable HiveContext from being instantiated when using
>> spark-shell? I get the following errors when more than one session
>> starts. Since I don't use HiveContext, it would be great if I could have
>> more than one spark-shell running at the same time.
>>
>> Exception in thread "main" java.lang.RuntimeException:
>> java.lang.RuntimeException: Unable to instantiate
>> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>>         at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
>>         at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:171)
>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>         at org.apache.spark.sql.hive.client.IsolatedClientLoader.liftedTree1$1(IsolatedClientLoader.scala:183)
>>         at org.apache.spark.sql.hive.client.IsolatedClientLoader.<init>(IsolatedClientLoader.scala:179)
>>         at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:226)
>>         at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:185)
>>         at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:392)
>>         at org.apache.spark.sql.SQLContext$$anonfun$5.apply(SQLContext.scala:235)
>>         at org.apache.spark.sql.SQLContext$$anonfun$5.apply(SQLContext.scala:234)
>>         at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>>         at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>>         at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>>         at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:234)
>>         at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:72)
>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>         at org.apache.spark.repl.SparkILoop.createSQLContext(SparkILoop.scala:1028)
>>         at org.apache.spark.repl.SparkILoopExt.importSpark(SparkILoopExt.scala:154)
>>         at org.apache.spark.repl.SparkILoopExt$$anonfun$process$1.apply$mcZ$sp(SparkILoopExt.scala:127)
>>         at org.apache.spark.repl.SparkILoopExt$$anonfun$process$1.apply(SparkILoopExt.scala:113)
>>         at org.apache.spark.repl.SparkILoopExt$$anonfun$process$1.apply(SparkILoopExt.scala:113)
>>
>> Best Regards,
>>
>> Jerry
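For what it's worth, the exception above is usually Derby refusing a second connection to the same ./metastore_db directory. A different workaround from the invalid hive-site.xml trick (untested on my side; javax.jdo.option.ConnectionURL is the standard Hive property, but the path below is just an example) is to give each session its own Derby database in hive-site.xml:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
      <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <!-- example path only: a per-session Derby database avoids the single-process lock -->
        <value>jdbc:derby:;databaseName=/tmp/metastore_session1;create=true</value>
      </property>
    </configuration>

That keeps HiveContext working and lets two shells start side by side, at the cost of each session getting its own empty metastore.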