Re: Hive and Impala

Mich Talebzadeh Tue, 01 Mar 2016 03:39:51 -0800

Just to clarify, the statement in quotes was made by the author of the
article:


"We can access all objects from Hive data warehouse with HiveQL which
leverages the map-reduce architecture in background for data retrieval and
transformation and this results in latency."

Dr Mich Talebzadeh



LinkedIn:
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 1 March 2016 at 11:33, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> I have not heard much about Impala lately. I saw an article on LinkedIn
> titled:
>
> "Apache Hive Or Cloudera Impala? What is Best for me?"
>
> "We can access all objects from Hive data warehouse with HiveQL which
> leverages the map-reduce architecture in background for data retrieval and
> transformation and this results in latency."
>
> My response was:
>
> This statement is no longer valid, as you now have a choice of three
> execution engines: MR, Spark and Tez. I have not used Impala myself, as I
> don't think there is a need for it, with Hive on Spark or Spark using the
> Hive metastore providing whatever is needed. Hive is for the data warehouse
> and provides what it says on the tin. Please also bear in mind that Hive
> offers ORC storage files that provide storage index capabilities, further
> optimizing queries with additional stats at file, stripe and row group
> levels.
>
> Anyway, the question is: with Hive on Spark, or Spark using the Hive
> metastore, what can we not achieve that we can achieve with Impala?
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
