You might need hive-site.xml
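In case it helps, here is a rough sketch of what I mean. Spark picks up Hive settings from a hive-site.xml placed in its conf directory, so one option is to copy the file there. The helper name install_hive_site and both paths in the example call are assumptions for illustration, not something taken from your setup.

```shell
# install_hive_site SRC_DIR SPARK_CONF_DIR
# Hypothetical helper: copies hive-site.xml from SRC_DIR into SPARK_CONF_DIR.
install_hive_site() {
  src="$1/hive-site.xml"
  if [ ! -f "$src" ]; then
    echo "no hive-site.xml in $1" >&2
    return 1
  fi
  # Create the target directory if needed, then copy the file.
  mkdir -p "$2" && cp "$src" "$2/hive-site.xml"
}

# Example call (both paths are assumptions; adjust to your installation):
# install_hive_site /etc/hive/conf "$SPARK_HOME/conf"
```

Restart the SparkR session afterwards so the HiveContext is created with the new settings.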
_____________________________
From: Peter Zhang <[email protected]>
Sent: Monday, January 18, 2016 9:08 PM
Subject: Re: SparkR with Hive integration
To: Jeff Zhang <[email protected]>
Cc: <[email protected]>
Thanks,
I will try.
Peter
--
Google
Sent with Airmail
On January 19, 2016 at 12:44:46, Jeff Zhang ([email protected]) wrote:
Please make sure you export the environment variable HADOOP_CONF_DIR,
pointing to the directory that contains core-site.xml.
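Something along these lines should do it; treat it as a sketch. The default path /etc/hadoop/conf and the helper check_hadoop_conf are assumptions, so substitute whatever directory holds the Hadoop client configuration on your machine.

```shell
# check_hadoop_conf DIR
# Hypothetical helper: succeeds only if DIR contains core-site.xml.
check_hadoop_conf() {
  if [ -f "$1/core-site.xml" ]; then
    echo "found $1/core-site.xml"
  else
    echo "core-site.xml not found in $1" >&2
    return 1
  fi
}

# /etc/hadoop/conf is an assumed default; keep any value that is already set.
export HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-/etc/hadoop/conf}"

# Warn early instead of failing later inside SparkR.
check_hadoop_conf "$HADOOP_CONF_DIR" || echo "fix HADOOP_CONF_DIR before starting SparkR" >&2
```

The variable has to be visible to the process that starts SparkR, so export it in the shell that launches RStudio or R.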
On Mon, Jan 18, 2016 at 8:23 PM, Peter Zhang
<[email protected]> wrote:
Hi all,
http://spark.apache.org/docs/latest/sparkr.html#sparkr-dataframes
From Hive tables
You can also create SparkR DataFrames from Hive tables. To do this we will
need to create a HiveContext which can access tables in the Hive MetaStore.
Note that Spark should have been built with Hive support and more details on
the difference between SQLContext and HiveContext can be found in the SQL
programming guide.

    # sc is an existing SparkContext.
    hiveContext <- sparkRHive.init(sc)

    sql(hiveContext, "CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")
    sql(hiveContext, "LOAD DATA LOCAL INPATH 'examples/src/main/resources/kv1.txt' INTO TABLE src")

    # Queries can be expressed in HiveQL.
    results <- sql(hiveContext, "FROM src SELECT key, value")

    # results is now a DataFrame
    head(results)
    ##  key   value
    ## 1 238 val_238
    ## 2  86  val_86
    ## 3 311 val_311
I ran the commands above in RStudio. When I run

    sql(hiveContext, "CREATE TABLE IF NOT EXISTS src (key INT, value STRING)")

I get this exception:

    16/01/19 12:11:51 INFO FileUtils: Creating directory if it doesn't exist: file:/user/hive/warehouse/src
    16/01/19 12:11:51 ERROR DDLTask: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:file:/user/hive/warehouse/src is not a directory or unable to create one)
How can I use HDFS instead of the local file system (file:)? Which
parameter should I set?
Thanks a lot.
Peter Zhang
--
Best Regards
Jeff Zhang