Hi,

Depending on your version of Hive & Spark, you may want to set hive.execution.engine=spark as a beeline command-line parameter (e.g. --hiveconf hive.execution.engine=spark), assuming you are running your Hive scripts through the beeline command line (which is the suggested practice for security purposes).
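If you stay with the hc.sql route described in the quoted mail below, a rough sketch of that approach might look like the following. This is only an illustration, not a tested implementation: the script path, the object/app names, and the naive split on ';' are my own assumptions.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Rough sketch: run an existing HiveQL script statement-by-statement
// through HiveContext. Path and splitting logic are illustrative only.
object RunHiveScript {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("RunHiveScript"))
    val hc = new HiveContext(sc)

    // Read the whole script on the driver; the statements have to be
    // issued sequentially from the driver anyway.
    val script = scala.io.Source.fromFile("/path/to/hivescript.hql").mkString

    // Naive split on ';': assumes no semicolons inside string literals.
    script.split(";").map(_.trim).filter(_.nonEmpty).foreach(stmt => hc.sql(stmt))

    sc.stop()
  }
}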
On Thu, Mar 9, 2017 at 2:09 PM, nancy henry <nancyhenry6...@gmail.com> wrote:
>
> Hi Team,
>
> Basically all our data is in Hive tables, and until now we have been
> processing it with Hive on MR. Now that HiveContext can run Hive queries
> on Spark, we are making all of these complex Hive scripts run through a
> hivecontext.sql(sc.textFile(hivescript)) kind of approach, i.e. basically
> running the Hive queries on Spark without writing anything in Scala yet.
> Even just making the Hive queries run on Spark already shows a big
> difference in runtime compared to running them on MR.
>
> So, since we already have the Hive scripts, should we just make those
> complex scripts run through hc.sql, given that hc.sql is able to do it?
>
> Or is that not best practice? Even though Spark can do it, is it still
> better to load all of those individual Hive tables into Spark, make RDDs,
> and write Scala code to get the same functionality we have in Hive?
>
> It is becoming difficult for us to choose whether to leave it to hc.sql
> to do the work of running the complex scripts, or to code it in Scala.
> Will the manual intervention be worth the effort in terms of performance?
>
> Example of our sample scripts:
>
> use db;
> create temporary function tempfunction1 as com.fgh.jkl.TestFunction;
>
> create desttable in hive;
> insert overwrite desttable select (big complex transformations and usage
> of hive udf)
> from table1, table2, table3 join table4 on some complex condition and
> join table7 on another complex condition where complex filtering
>
> So please help: what would be the best approach, and why should I not
> give the entire script to HiveContext to make its own RDDs and run on
> Spark, if we are able to do it?
>
> All the examples I see online only show hc.sql("select * from table1")
> and nothing more complex than that.

--
Best Regards,
Ayan Guha
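(For comparison, the hand-coded alternative weighed in the question above, loading the Hive tables as DataFrames and expressing the joins and filters in Scala, could look roughly like this. The table names, column names, and join condition are made up purely for illustration:)

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Sketch of the DataFrame alternative; all table and column names
// below are hypothetical placeholders.
object DataFrameAlternative {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("DataFrameAlternative"))
    val hc = new HiveContext(sc)

    val t1 = hc.table("table1")
    val t4 = hc.table("table4")

    val result = t1.join(t4, t1("id") === t4("id"))   // stands in for the complex join condition
      .filter(t1("status") === "ACTIVE")              // stands in for the complex filtering
      .select(t1("id"), t4("value"))                  // stands in for the big transformations

    // Rough equivalent of the "insert overwrite desttable ..." step in the script above.
    result.write.mode("overwrite").saveAsTable("desttable")

    sc.stop()
  }
}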