Yes, that is what I meant.

In practice, it is often not possible to fully denormalize a fact table (for 
instance, if you need type 1 dimension behavior, where an attribute change 
must overwrite the old value everywhere).
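
To make the type 1 point concrete, here is a minimal sketch in plain Python (not Hive code). The table and column names are hypothetical, invented only for illustration. It copies dimension attributes into the fact rows and then applies a type 1 overwrite to the dimension, which the already-written copies do not pick up:

```python
# Hypothetical star schema: one dimension table and one fact table.
customer_dim = {
    1: {"name": "Acme", "region": "EMEA"},
    2: {"name": "Globex", "region": "APAC"},
}
sales_fact = [
    {"customer_id": 1, "amount": 100},
    {"customer_id": 2, "amount": 250},
    {"customer_id": 1, "amount": 75},
]


def denormalize(facts, dim):
    """Copy each row's dimension attributes into the fact row itself."""
    return [{**row, **dim[row["customer_id"]]} for row in facts]


flat_sales = denormalize(sales_fact, customer_dim)

# Type 1 change: overwrite the attribute in place, keeping no history.
customer_dim[1]["region"] = "AMER"

# The star schema sees the new value at join (query) time ...
joined_regions = [customer_dim[r["customer_id"]]["region"] for r in sales_fact]
# ... but the copies already written into the flat table keep the old one.
flat_regions = [r["region"] for r in flat_sales]
```

Getting type 1 behavior out of the flat table would mean rewriting every affected row, which is why some attributes tend to stay in a separate dimension.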


PS:
We're using 'denormalize' a bit loosely here since we are comparing to a star 
schema and not a 3NF schema. The proper term here, arguably, is to `degenerate 
the dimension`. https://en.wikipedia.org/wiki/Degenerate_dimension

Grant Overby
Software Engineer
Cisco.com (http://www.cisco.com/)
grove...@cisco.com
Mobile: 865 724 4910

This email may contain confidential and privileged material for the sole use of 
the intended recipient. Any review, use, distribution or disclosure by others 
is strictly prohibited. If you are not the intended recipient (or authorized to 
receive for the recipient), please contact the sender by reply email and delete 
all copies of this message.

Company Registration Information: 
http://www.cisco.com/web/about/doing_business/legal/cri/index.html





From: Ashok Kumar <ashok34...@yahoo.com>
Reply-To: user@hive.apache.org, Ashok Kumar <ashok34...@yahoo.com>
Date: Friday, December 18, 2015 at 4:22 PM
To: user@hive.apache.org
Subject: Re: The advantages of Hive/Hadoop comnpared to Data Warehouse

Thank you sir.

Can you please describe in a bit more detail what you mean by "a fully 
denormalized columnar store"? Do you mean getting rid of the star schema 
altogether in Hive and replacing it with ORC tables?

Regards


On Friday, 18 December 2015, 21:13, Grant Overby (groverby) 
<grove...@cisco.com> wrote:


You forgot horizontal scaling.

A fully denormalized columnar store in Hive will outperform a star schema in 
Oracle in every way imaginable at scale; however, if your data isn't big 
enough, this is a moot point.
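
As a rough illustration of what "denormalized columnar" means, here is a toy sketch in plain Python. It is not Hive or ORC code, the data and names are hypothetical, and it says nothing about real-world performance. It only shows the shape of the idea: each column is stored separately, so an aggregate touches just the columns it needs and involves no join:

```python
# Hypothetical denormalized fact table, stored row by row.
rows = [
    {"region": "EMEA", "product": "router", "amount": 100},
    {"region": "APAC", "product": "switch", "amount": 250},
    {"region": "EMEA", "product": "switch", "amount": 75},
]

# Columnar layout: one list per column, values kept in row order.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def total_by(columns, key, measure):
    """Sum `measure` grouped by `key`, reading only those two columns."""
    totals = {}
    for k, v in zip(columns[key], columns[measure]):
        totals[k] = totals.get(k, 0) + v
    return totals


totals = total_by(columns, "region", "amount")
```

Here the "product" column is never read, and "region" is already in the table, so there is no dimension lookup.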

If your data fits in a traditional BI warehouse, and especially if it does so 
without people constantly complaining about performance or retention issues, 
your data may simply not be large enough to warrant a Hadoop stack.

How many rows are in your largest fact table?





From: Ashok Kumar <ashok34...@yahoo.com>
Reply-To: User <user@hive.apache.org>, Ashok Kumar <ashok34...@yahoo.com>
Date: Friday, December 18, 2015 at 4:01 PM
To: User <user@hive.apache.org>
Subject: The advantages of Hive/Hadoop comnpared to Data Warehouse

Gurus,

Some analysts keep asking me about the advantages of having Hive tables when 
the star schema in the Data Warehouse (DW) does the same job.

For example, if you have fact and dimension tables in the DW and just import 
them into Hive via, say, Sqoop, what are we going to gain?

I keep telling them about storage economy and cheap disks, that 
de-normalisation can be taken further, etc. However, they are not convinced :(

Any additional comments will help my case.

Thanks a lot

