I would suggest adding a config parameter that allows bypassing initialization of HiveContext in case of SQLException
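Just to sketch the idea (the flag name "spark.repl.skipHiveContext" below is made up for illustration, not an existing Spark option, and the wiring is only a rough guess at how the shell startup could behave), something along these lines would let the REPL degrade to a plain SQLContext instead of failing:

    import scala.util.control.NonFatal

    import org.apache.spark.SparkContext
    import org.apache.spark.sql.SQLContext

    // Rough sketch only: fall back to a plain SQLContext when the user opts out of
    // Hive, or when HiveContext initialization fails (e.g. metastore/Derby errors).
    def createSqlContext(sc: SparkContext): SQLContext = {
      // "spark.repl.skipHiveContext" is a hypothetical name for the proposed flag.
      val skipHive = sc.getConf.getBoolean("spark.repl.skipHiveContext", false)
      if (skipHive) {
        new SQLContext(sc)
      } else {
        try {
          // Load HiveContext reflectively so the shell still works without the hive module.
          Class.forName("org.apache.spark.sql.hive.HiveContext")
            .getConstructor(classOf[SparkContext])
            .newInstance(sc)
            .asInstanceOf[SQLContext]
        } catch {
          case NonFatal(_) =>
            // Hive support could not be brought up; degrade gracefully instead of failing the shell.
            new SQLContext(sc)
        }
      }
    }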
Cheers

On Fri, Nov 6, 2015 at 2:50 PM, Zhan Zhang <zzh...@hortonworks.com> wrote:

> Hi Jerry,
>
> OK. Here is an ugly workaround.
>
> Put a hive-site.xml with invalid content under $SPARK_HOME/conf. You will
> get a bunch of exceptions because HiveContext initialization fails, but
> you can then initialize your own SQLContext:
>
> scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
> sqlContext: org.apache.spark.sql.SQLContext =
> org.apache.spark.sql.SQLContext@4a5cc2e8
>
> scala> import sqlContext.implicits._
> import sqlContext.implicits._
>
> For example:
>
> HW11188:spark zzhang$ more conf/hive-site.xml
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> <configuration>
>
>   <property>
>     <name>hive.metastore.uris</name>
>     <value>thrift://zzhang-yarn11:9083</value>
>   </property>
>
> </configuration>
> HW11188:spark zzhang$
>
> By the way, I don’t know whether there is any caveat to this workaround.
>
> Thanks.
>
> Zhan Zhang
>
>
> On Nov 6, 2015, at 2:40 PM, Jerry Lam <chiling...@gmail.com> wrote:
>
> Hi Zhan,
>
> I don’t use HiveContext features at all. I mostly use the DataFrame API. It is
> sexier and much less typo-prone. :)
> Also, HiveContext requires a metastore database setup (Derby by default).
> The problem is that I cannot have two spark-shell sessions running at the
> same time on the same host (e.g. from the /home/jerry directory). It gives me an
> exception like the one below.
>
> Since I don’t use HiveContext, I don’t see the need to maintain a
> database.
>
> What is interesting is that the pyspark shell is able to start more than one
> session at the same time. I wonder what pyspark does better than
> spark-shell?
>
> Best Regards,
>
> Jerry
>
> On Nov 6, 2015, at 5:28 PM, Zhan Zhang <zzh...@hortonworks.com> wrote:
>
> If your assembly jar has the Hive jars included, HiveContext will be used.
> Typically, HiveContext has more functionality than SQLContext. In what case
> do you need something from SQLContext that cannot be done with HiveContext?
>
> Thanks.
>
> Zhan Zhang
>
> On Nov 6, 2015, at 10:43 AM, Jerry Lam <chiling...@gmail.com> wrote:
>
> What is interesting is that the pyspark shell works fine with multiple sessions
> on the same host, even though multiple HiveContexts have been created. What
> does pyspark do differently when starting up the shell?
>
> On Nov 6, 2015, at 12:12 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
> In SQLContext.scala:
>
>   // After we have populated SQLConf, we call setConf to populate other
>   // confs in the subclass (e.g. hiveconf in HiveContext).
>   properties.foreach {
>     case (key, value) => setConf(key, value)
>   }
>
> I don't see a config for skipping the above call.
>
> FYI
>
> On Fri, Nov 6, 2015 at 8:53 AM, Jerry Lam <chiling...@gmail.com> wrote:
>
>> Hi spark users and developers,
>>
>> Is it possible to disable HiveContext from being instantiated when using
>> spark-shell? I get the following errors when more than one session
>> starts. Since I don't use HiveContext, it would be great if I could have
>> more than one spark-shell running at the same time.
>>
>> Exception in thread "main" java.lang.RuntimeException:
>> java.lang.RuntimeException: Unable to instantiate
>> org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
>>         at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
>>         at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:171)
>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>         at org.apache.spark.sql.hive.client.IsolatedClientLoader.liftedTree1$1(IsolatedClientLoader.scala:183)
>>         at org.apache.spark.sql.hive.client.IsolatedClientLoader.<init>(IsolatedClientLoader.scala:179)
>>         at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:226)
>>         at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:185)
>>         at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:392)
>>         at org.apache.spark.sql.SQLContext$$anonfun$5.apply(SQLContext.scala:235)
>>         at org.apache.spark.sql.SQLContext$$anonfun$5.apply(SQLContext.scala:234)
>>         at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>>         at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>>         at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>>         at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:234)
>>         at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:72)
>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>>         at org.apache.spark.repl.SparkILoop.createSQLContext(SparkILoop.scala:1028)
>>         at org.apache.spark.repl.SparkILoopExt.importSpark(SparkILoopExt.scala:154)
>>         at org.apache.spark.repl.SparkILoopExt$$anonfun$process$1.apply$mcZ$sp(SparkILoopExt.scala:127)
>>         at org.apache.spark.repl.SparkILoopExt$$anonfun$process$1.apply(SparkILoopExt.scala:113)
>>         at org.apache.spark.repl.SparkILoopExt$$anonfun$process$1.apply(SparkILoopExt.scala:113)
>>
>> Best Regards,
>>
>> Jerry
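For what it's worth, the exception above is usually Derby refusing a second connection to the same ./metastore_db directory. A different workaround from the invalid hive-site.xml trick (untested on my side; javax.jdo.option.ConnectionURL is the standard Hive property, but the path below is just an example) is to give each session its own Derby database in hive-site.xml:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
      <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <!-- example path only: a per-session Derby database avoids the single-process lock -->
        <value>jdbc:derby:;databaseName=/tmp/metastore_session1;create=true</value>
      </property>
    </configuration>

That keeps HiveContext working and lets two shells start side by side, at the cost of each session getting its own empty metastore.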