Yes, but be sure to put hive-site.xml on your classpath.
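If it helps, here is a quick way to verify that hive-site.xml is actually visible on the classpath before starting Spark. This is a minimal, Spark-free sketch; the object and helper names are made up:

```scala
// Minimal classpath check: ClassLoader.getResource returns null when a
// resource is not visible on the classpath.
object HiveSiteCheck {
  def onClasspath(resource: String): Boolean =
    getClass.getClassLoader.getResource(resource) != null

  def main(args: Array[String]): Unit =
    println(
      if (onClasspath("hive-site.xml")) "hive-site.xml found on classpath"
      else "hive-site.xml NOT on classpath; copy it into Spark's conf directory"
    )
}
```

If the check prints NOT, copying hive-site.xml into Spark's conf directory is the usual fix, since that directory is added to the classpath by the launch scripts.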

Have you run into any problems?

Cheng Hao

From: Sanjay Subramanian [mailto:sanjaysubraman...@yahoo.com.INVALID]
Sent: Thursday, May 28, 2015 8:53 AM
To: user
Subject: Pointing SparkSQL to existing Hive Metadata with data file locations in HDFS

hey guys

On our Hive/Hadoop ecosystem, which uses the Cloudera distribution CDH 5.2.x, there are about 300 Hive tables.
The data is stored as text (slowly moving to Parquet) on HDFS.
I want to point SparkSQL at the existing Hive metadata and be able to define JOINs etc. using a programming structure like this:

import org.apache.spark.sql.hive.HiveContext

val sqlContext = new HiveContext(sc)  // sc is the existing SparkContext
val schemaRdd = sqlContext.sql("some complex SQL")
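The pattern above can be filled in with an actual join once the HiveContext can read the metastore; existing Hive tables are then queryable by name. A hedged sketch; the table and column names below are hypothetical:

```scala
// Hypothetical join across two existing Hive tables. The SQL is built as a
// plain string and would be passed to sqlContext.sql(...) on a live cluster.
val joinQuery =
  """SELECT o.order_id, c.customer_name
    |FROM orders o
    |JOIN customers c ON o.customer_id = c.id
    |WHERE o.order_date >= '2015-01-01'""".stripMargin

// On a cluster with a configured HiveContext:
// val result = sqlContext.sql(joinQuery)  // SchemaRDD (DataFrame in Spark 1.3+)
// result.collect().foreach(println)
```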


Is that the way to go? Any guidance would be great.

thanks

sanjay
