Okay, what is the difference between setting hive.execution.engine=spark and running the script through hivecontext.sql?
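For concreteness, the two approaches being compared can be invoked roughly as follows. This is a sketch only: the JDBC URL, jar, class, and script names are placeholders, and the comments summarize the usual distinction (Hive on Spark keeps Hive as the query engine with Spark executing the work, while HiveContext uses Spark's own SQL engine against the Hive metastore):

```shell
# Option 1: Hive on Spark. Hive still parses, plans, and runs the query;
# Spark only replaces MR as the execution backend.
# (JDBC URL and script name below are placeholders.)
beeline -u jdbc:hive2://hiveserver:10000/default \
        --hiveconf hive.execution.engine=spark \
        -f my_script.hql

# Option 2: Spark SQL via HiveContext. Spark's own SQL engine parses and
# runs the query, reading table metadata from the Hive metastore.
# (Jar, class, and script names are placeholders.)
spark-submit --class HiveScriptRunner my-app.jar my_script.hql
```

Both run the work on a cluster, but the planner and optimizer differ, which is one reason the observed runtimes differ.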
On Mar 9, 2017 8:52 AM, "ayan guha" <guha.a...@gmail.com> wrote:

> Hi,
>
> Subject to your versions of Hive and Spark, you may want to set
> hive.execution.engine=spark as a beeline command-line parameter, assuming
> you are running your Hive scripts through the beeline command line (which
> is the suggested practice for security reasons).
>
> On Thu, Mar 9, 2017 at 2:09 PM, nancy henry <nancyhenry6...@gmail.com> wrote:
>
>> Hi Team,
>>
>> Basically, we have all our data in Hive tables, and until now we have
>> been processing it with Hive on MR. Now that HiveContext can run Hive
>> queries on Spark, we are making all of our complex Hive scripts run
>> through hivecontext.sql(sc.textfile(hivescript))-style code, i.e. running
>> the Hive queries on Spark without writing any Scala yet. Just moving the
>> Hive queries onto Spark already shows a large difference in runtime
>> compared to MR.
>>
>> So, since we already have the Hive scripts, should we keep running them
>> through hc.sql, given that hc.sql can handle them?
>>
>> Or is that not best practice? Even though Spark can do it, is it still
>> better to load those individual Hive tables into Spark, build RDDs, and
>> write Scala code to reproduce the same logic?
>>
>> It is becoming difficult for us to decide whether to leave the work of
>> running these complex scripts to hc.sql or to code it in Scala. Will the
>> manual effort be worth it in terms of performance?
>>
>> Example of one of our scripts:
>>
>> use db;
>> create temporary function tempfunction1 as 'com.fgh.jkl.TestFunction';
>>
>> (create desttable in Hive)
>> insert overwrite table desttable
>> select (big complex transformations and usage of Hive UDFs)
>> from table1, table2, table3
>> join table4 on (some complex condition)
>> join table7 on (another complex condition)
>> where (complex filtering);
>>
>> So please help: what would be the best approach, and why should I not
>> hand the entire script to HiveContext and let it build its own RDDs and
>> run on Spark, if it is able to do it?
>>
>> Because all the examples I see online only show
>> hc.sql("select * from table1") and nothing more complex than that.
>
> --
> Best Regards,
> Ayan Guha
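One detail worth noting about the hc.sql(sc.textfile(hivescript)) approach described above: hc.sql takes a single SQL string, not an RDD, so the script has to be read on the driver and executed one statement at a time. A minimal Scala sketch, with the object name and the naive semicolon split both illustrative (the split breaks on string literals containing ';'), and the actual HiveContext call shown as a comment since it needs a Spark 1.x runtime built with Hive support:

```scala
import scala.io.Source

object HiveScriptRunner {
  // Split a multi-statement Hive script into individual statements.
  // Naive: does not handle ';' inside string literals or comments.
  def statements(script: String): Seq[String] =
    script.split(";").map(_.trim).filter(_.nonEmpty).toSeq

  def main(args: Array[String]): Unit = {
    val script = Source.fromFile(args(0)).mkString
    // With a HiveContext `hc` in scope (Spark 1.x with Hive support),
    // each statement would be run in order:
    //   statements(script).foreach(stmt => hc.sql(stmt))
    statements(script).foreach(println)
  }
}
```

Running the statements sequentially this way reproduces what a Hive script does, while leaving all planning and execution to Spark SQL.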