Hi Michael,
If I understand correctly, the assembly JAR file is deployed onto HDFS under the 
/user/$USER/.sparkStaging folder, which will be used by all computing (worker) 
nodes when people run in yarn-cluster mode.
Could you elaborate on what the document means by this? It is a bit 
misleading, and I guess it only applies to standalone mode?
Andrew L

Date: Fri, 25 Jul 2014 15:25:42 -0700
Subject: RE: Spark SQL and Hive tables
From: ssti...@live.com
To: user@spark.apache.org

Thanks!  Will do.

Sent via the Samsung GALAXY S®4, an AT&T 4G LTE smartphone

-------- Original message --------
From: Michael Armbrust 
Date: 07/25/2014 3:24 PM (GMT-08:00) 
To: user@spark.apache.org 
Subject: Re: Spark SQL and Hive tables 

[S]ince Hive has a large number of dependencies, it is not included in the 
default Spark assembly. In order to use Hive you must first run 
‘SPARK_HIVE=true sbt/sbt assembly/assembly’ (or use -Phive for Maven). This 
command builds a new assembly jar that includes Hive. Note that this Hive 
assembly jar must also be present on all of the worker nodes, as they will need 
access to the Hive serialization and deserialization libraries (SerDes) in 
order to access data stored in Hive.
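
A minimal sketch of what this enables once the Hive-enabled assembly is in 
place (assuming spark-shell, and a table named src already in the Hive 
metastore, as in the Spark SQL programming guide):

// sc is the SparkContext that spark-shell provides.
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
import hiveContext._

// hql() runs HiveQL against tables in the Hive metastore; "src" is just the
// example table from the programming guide, substitute your own.
hql("SELECT key, value FROM src LIMIT 10").collect().foreach(println)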

On Fri, Jul 25, 2014 at 3:20 PM, Sameer Tilak <ssti...@live.com> wrote:

Hi Jerry,

I am having trouble with this. Maybe something is wrong with my import or 
version, etc.

scala> import org.apache.spark.sql._;
import org.apache.spark.sql._

scala> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
<console>:24: error: object hive is not a member of package org.apache.spark.sql
       val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
                                                  ^
Here is what I see for autocompletion:

scala> org.apache.spark.sql.
Row             SQLContext      SchemaRDD       SchemaRDDLike   api
catalyst        columnar        execution       package         parquet
test
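
(A quick, hypothetical check from the same shell: if the assembly was built 
without SPARK_HIVE=true / -Phive, the HiveContext class is simply not on the 
classpath at all.)

// Prints which case applies; ClassNotFoundException means no Hive support.
try {
  Class.forName("org.apache.spark.sql.hive.HiveContext")
  println("Hive support is present in this assembly")
} catch {
  case _: ClassNotFoundException =>
    println("This assembly was built without Hive support")
}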

Date: Fri, 25 Jul 2014 17:48:27 -0400
Subject: Re: Spark SQL and Hive tables
From: chiling...@gmail.com
To: user@spark.apache.org

Hi Sameer,

The blog post you referred to is about Spark SQL, but I don't think it is meant 
to guide you through reading data from Hive via Spark SQL. So don't worry too 
much about the blog post.

The programming guide I referred to demonstrates how to read data from Hive 
using Spark SQL. It is a good starting point.

Best Regards,

Jerry

On Fri, Jul 25, 2014 at 5:38 PM, Sameer Tilak <ssti...@live.com> wrote:

Hi Michael,
Thanks. I am not creating a HiveContext, I am creating a SQLContext. I am using 
CDH 5.1. Can you please let me know which conf/ directory you are talking about? 

From: mich...@databricks.com
Date: Fri, 25 Jul 2014 14:34:53 -0700
Subject: Re: Spark SQL and Hive tables
To: user@spark.apache.org

In particular, have you put your hive-site.xml in the conf/ directory?  Also, 
are you creating a HiveContext instead of a SQLContext?
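
To make the distinction concrete, a minimal sketch (assuming spark-shell started 
from a Hive-enabled build, with your hive-site.xml copied into Spark's conf/ 
directory so the metastore can be found):

// SQLContext only sees tables you register yourself (e.g. with registerAsTable);
// it does not talk to the Hive metastore.
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

// HiveContext reads conf/hive-site.xml and can query tables that already exist
// in the Hive metastore.
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)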

On Fri, Jul 25, 2014 at 2:27 PM, Jerry Lam <chiling...@gmail.com> wrote:

Hi Sameer,

Maybe this page will help you: 
https://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables

Best Regards,

Jerry

On Fri, Jul 25, 2014 at 5:25 PM, Sameer Tilak <ssti...@live.com> wrote:

Hi All,
I am trying to load data from Hive tables using Spark SQL. I am using 
spark-shell. Here is what I see: 

val trainingDataTable = sql("""SELECT prod.prod_num, demographics.gender, 
demographics.birth_year, demographics.income_group  FROM prod p JOIN 
demographics d ON d.user_id = p.user_id""")

14/07/25 14:18:46 INFO Analyzer: Max iterations (2) reached for batch 
MultiInstanceRelations
14/07/25 14:18:46 INFO Analyzer: Max iterations (2) reached for batch 
CaseInsensitiveAttributeReferences
java.lang.RuntimeException: Table Not Found: prod.

I have these tables in Hive; I used the show tables command to confirm this. 
Can someone please let me know how I can make them accessible here?
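
(For reference, a sketch of the same query through a HiveContext, assuming a 
Hive-enabled Spark build and hive-site.xml in conf/; note that the FROM clause 
aliases prod as p and demographics as d, so the SELECT list should use those 
aliases:)

val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
import hiveContext._

val trainingDataTable = hql("""
  SELECT p.prod_num, d.gender, d.birth_year, d.income_group
  FROM prod p JOIN demographics d ON d.user_id = p.user_id""")

trainingDataTable.take(5).foreach(println)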