This depends on which output format you want. For Parquet, you can
simply do this:
hiveContext.table("some_db.some_table").saveAsParquetFile("hdfs://path/to/file")
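If you need plain text on HDFS rather than Parquet, one workaround (a sketch, assuming Spark 1.x, where a Row behaves as a sequence of column values; the table and path names are illustrative) is to bypass HiveQL entirely and write the RDD out yourself:

```scala
// Read the Hive table and write it back out as delimited text.
// "\u0001" matches Hive's default field delimiter, so the output
// can later be loaded into another Hive table if needed.
val rows = hiveContext.table("some_db.some_table")
rows.map(_.mkString("\u0001"))
    .saveAsTextFile("hdfs://path/to/output_dir")
```

Like `INSERT OVERWRITE DIRECTORY`, this fails if the output directory already exists, so delete it first.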
On 12/23/14 5:22 PM, LinQili wrote:
Hi Leo:
Thanks for your reply.
I am talking about using Hive from a Spark program to export data from Hive to HDFS,
something like:
val exportData = s"insert overwrite directory
'/user/linqili/tmp/src' select * from $DB.$tableName"
hiveContext.sql(exportData)
but that is unsupported in Spark for now:
Exception in thread "Thread-3" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at
org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:183)
Caused by: java.lang.RuntimeException:
Unsupported language features in query: insert overwrite directory
'/user/linqili/tmp/src' select * from test_spark.src
TOK_QUERY
TOK_FROM
TOK_TABREF
TOK_TABNAME
test_spark
src
TOK_INSERT
TOK_DESTINATION
TOK_DIR
'/user/linqili/tmp/src'
TOK_SELECT
TOK_SELEXPR
TOK_ALLCOLREF
at scala.sys.package$.error(package.scala:27)
at org.apache.spark.sql.hive.HiveQl$.parseSql(HiveQl.scala:256)
at org.apache.spark.sql.hive.HiveContext.hiveql(HiveContext.scala:106)
at org.apache.spark.sql.hive.HiveContext.hql(HiveContext.scala:110)
at com.nd.huayuedu.HiveExportTest$.main(HiveExportTest.scala:35)
at com.nd.huayuedu.HiveExportTest.main(HiveExportTest.scala)
... 5 more
------------------------------------------------------------------------
Date: Tue, 23 Dec 2014 16:47:11 +0800
From: leo.chen.cip...@outlook.com
To: lin_q...@outlook.com
Subject: Re: How to export data from hive into hdfs in spark program?
Hi,
If you are talking about using Spark's Thrift server, this query should
work:
export table $DB.$tableName to '/user/linqili/tmp/src';
However, you need to take care of that folder first (by deleting it, I
presume).
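Deleting the target folder up front can be done from the same program with the Hadoop FileSystem API; a minimal sketch, assuming the Hadoop client libraries are already on the classpath (which they are in any Spark application) and using the path from the thread as an example:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Remove the export directory if it exists, so the subsequent
// EXPORT TABLE (or any other write) does not fail on a non-empty target.
val fs = FileSystem.get(new Configuration())
val exportPath = new Path("/user/linqili/tmp/src")
if (fs.exists(exportPath)) {
  fs.delete(exportPath, true) // true = recursive delete
}
```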
Cheers,
Leo
On 2014/12/23 16:09, LinQili wrote:
Hi all:
I wonder if there is a way to export data from a Hive table into
HDFS using Spark,
like this: INSERT OVERWRITE DIRECTORY '/user/linqili/tmp/src'
select * from $DB.$tableName