Yes, you are on the right path. The only thing to remember is to place
hive-site.xml in the correct path so Spark can talk to the Hive metastore.
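A minimal sketch of that placement step, assuming a standard Spark-on-CDH layout (the `/etc/hive/conf` and `$SPARK_HOME` paths are assumptions; adjust for your install):

```shell
# Copy the Hive client config into Spark's conf directory so that
# HiveContext can pick up the metastore URI (paths assumed for CDH 5.2.x).
cp /etc/hive/conf/hive-site.xml "$SPARK_HOME/conf/"

# Quick sanity check from spark-shell: list the Hive tables.
#   $SPARK_HOME/bin/spark-shell
#   scala> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
#   scala> sqlContext.sql("show tables").collect().foreach(println)
```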

Best
Ayan
On 28 May 2015 10:53, "Sanjay Subramanian"
<sanjaysubraman...@yahoo.com.invalid> wrote:

> hey guys
>
> On the Hive/Hadoop ecosystem we are using the Cloudera distribution CDH
> 5.2.x, and there are about 300+ Hive tables.
> The data is stored as text (moving slowly to Parquet) on HDFS.
> I want to use SparkSQL, point it at the Hive metadata, and be able to
> define JOINs etc. using a programming structure like this:
>
> import org.apache.spark.sql.hive.HiveContext
> val sqlContext = new HiveContext(sc)
> val schemaRdd = sqlContext.sql("some complex SQL")
>
>
> Is that the way to go ? Some guidance will be great.
>
> thanks
>
> sanjay
>
