Hive is well suited to the massive transformations needed in ETL-type 
processing and to full data set analytics. Impala is better suited to fast 
analytical queries that return a small subset of the original data set. Both 
are improving in concurrency and latency; however, they have a long way to go 
to beat commercial MPP solutions in performance and stability. Their key 
advantages are storage economics and flexibility (schema on read).

On Apr 27, 2015, at 6:27 AM, Anilkumar Kalshetti 
<anilkalshe...@gmail.com> wrote:

Hi Ashok,

Also, you can now use Spark as the execution engine for Hive. Please check 
the Hive on Spark (HoS) project.

Ref: 
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
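
For anyone who wants to try this, the engine is selected per session through 
the hive.execution.engine property, which accepts mr, tez, or spark. Below is 
a minimal sketch, assuming you build the statement in Python before sending 
it through whatever Hive client you use. The helper name engine_statement is 
hypothetical and only for illustration; the chosen engine must already be 
installed and configured on the cluster.

```python
# Illustrative helper (not part of Hive): builds the session-level statement
# that selects Hive's execution engine. Only the property name
# "hive.execution.engine" and its values mr/tez/spark come from Hive itself.
VALID_ENGINES = ("mr", "tez", "spark")


def engine_statement(engine: str) -> str:
    """Return the HiveQL SET statement for the requested engine."""
    name = engine.strip().lower()
    if name not in VALID_ENGINES:
        raise ValueError(
            f"unknown engine {engine!r}; expected one of {VALID_ENGINES}"
        )
    return f"SET hive.execution.engine={name};"


if __name__ == "__main__":
    # Print one statement per engine; run the one you need at the start
    # of a Hive session, before the queries it should apply to.
    for engine in VALID_ENGINES:
        print(engine_statement(engine))
```

The statement affects only the current session, so it is an easy way to 
compare the same query on MapReduce, Tez, and Spark.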

Thanks

On 27 April 2015 at 15:22, Fabio C. 
<anyte...@gmail.com> wrote:
If the comparison mentions only MR, then it is probably outdated. Hive can now 
run on Tez, with a great improvement in performance.
However, I don't know how Hive+Tez compares with Impala.

On Mon, Apr 27, 2015 at 10:50 AM, Nitin Pawar 
<nitinpawar...@gmail.com> wrote:
What use case are you trying to solve?

On Mon, Apr 27, 2015 at 2:16 PM, Ashok Kumar 
<ashok34...@yahoo.com> wrote:
Hi gurus,

Kindly help me understand the advantage that Impala has over Hive.

I read a note that Impala does not use the MapReduce engine and is therefore 
much faster than Hive for queries. However, Hive, as I understand it, is 
widely used everywhere!

Thank you




--
Nitin Pawar

