Re: Hive on Spark performance

2016-03-14 Thread sjayatheertha
Thanks for your response. We were evaluating Spark and were curious to know how it is used today and the lowest latency it can provide.
> On Mar 14, 2016, at 8:37 AM, Mich Talebzadeh wrote:
>
> Hi Wlodeck,
>
> Let us look at this.
>
> In Oracle I have two tables channels and sales. This c

Re: Hive on Spark performance

2016-03-14 Thread Mich Talebzadeh
Hi Wlodeck,
Let us look at this. In Oracle I have two tables, channels and sales. This code works in Oracle:
  select c.channel_id, sum(c.channel_id * (select count(1) from sales s WHERE c.channel_id = s.channel_id)) As R
  from channels c
  group by c.channel_id
s...@mydb.mich.LOCAL> /
C

Re: Hive on Spark performance

2016-03-14 Thread ws
Hive 1.2.1.2.3.4.0-3485
Spark 1.5.2
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
###
SELECT f.description, f.item_number, sum(f.df_a * (select count(1) from e.mv_A_h_a where hb_h_name = r.h_id)) as df_a
FROM e.eng_fac_atl_sc_bf_qty f, wv_ATL_2_qty_df_rates r
where f.

Re: Hive on Spark performance

2016-03-13 Thread Mich Talebzadeh
That depends on the version of Hive you run on the Spark engine. As far as I am aware, the latest version of Hive that I am using (Hive 2) has improvements over the previous versions of Hive (0.14, 1.2.1) on the Spark engine. As of today I have managed to use Hive 2.0 on Spark version 1.3.1. So it is not the l

Hive on Spark performance

2016-03-13 Thread sjayatheertha
Just curious if you could share your experience with the performance of Spark in your company. How much data do you process? And what latency are you getting with the Spark engine?
Vidya