We have a similar use case. We use the DataFrame API to cache data out of
Hive tables, and then run pretty complex scripts on them. You can register
your Hive UDFs to be used within Spark SQL statements if you want.
Something like this:
sqlContext.sql("CREATE TEMPORARY FUNCTION as ''")
okay, what is the difference between setting hive.execution.engine=spark
and
running the script through hivecontext.sql?
On Mar 9, 2017 8:52 AM, "ayan guha" wrote:
Hi
Subject to your versions of Hive & Spark, you may want to set
hive.execution.engine=spark as a beeline command-line parameter, assuming you
are running your Hive scripts via the beeline command line (which is the
suggested practice for security purposes).
On Thu, Mar 9, 2017 at 2:09 PM, nancy henry wrote:
Hi Team,
Basically we have all of our data as Hive tables, and until now we have been
processing it in Hive on MR. Now that we have HiveContext, which can run Hive
queries on Spark, we are making all these complex Hive scripts run using a
hivecontext.sql(sc.textFile(hivescript))-style approach, i.e. basically
running the whole script through HiveContext.
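For reference, a minimal sketch of that approach on Spark 1.x (the script
path is a placeholder, and note that sc.textFile returns an RDD of lines, so
it has to be collected into a single string before being split into
statements):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val sc = new SparkContext(new SparkConf().setAppName("HiveScriptRunner"))
    val hiveContext = new HiveContext(sc)

    // Collect the script's lines and join them into one string.
    val script = sc.textFile("/path/to/script.hql").collect().mkString("\n")

    // Naively split on ';' and run each non-empty statement in order
    // (this breaks if a statement contains a ';' inside a string literal).
    script.split(";").map(_.trim).filter(_.nonEmpty).foreach { stmt =>
      hiveContext.sql(stmt)
    }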