Yes, that is what I meant. In practice, it is often not possible to reach a fully denormalized fact table (for instance, if you need type 1 dimension behavior, where an attribute is overwritten in place and the change must show up on existing facts).
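A minimal sketch of why type 1 behavior gets in the way of full denormalization. It uses Python's built-in sqlite3 purely as a stand-in for Hive or Oracle; the table and column names (dim_customer, fact_sales, fact_sales_flat, region) are hypothetical and not taken from this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Star schema: each fact row carries only a key into the dimension.
cur.execute("CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, region TEXT)")
cur.execute("CREATE TABLE fact_sales (sale_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")
cur.executemany("INSERT INTO dim_customer VALUES (?, ?)", [(1, "EMEA"), (2, "APAC")])
cur.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                [(10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0)])

# Denormalized copy: the dimension attribute is folded into every fact row.
cur.execute("""
    CREATE TABLE fact_sales_flat AS
    SELECT f.sale_id, f.customer_id, f.amount, d.region
    FROM fact_sales f
    JOIN dim_customer d ON d.customer_id = f.customer_id
""")

def totals_star():
    # Region is resolved through the join at query time.
    rows = cur.execute("""
        SELECT d.region, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_customer d ON d.customer_id = f.customer_id
        GROUP BY d.region
    """).fetchall()
    return dict(rows)

def totals_flat():
    # Region is read straight from the fact row; no join needed.
    rows = cur.execute(
        "SELECT region, SUM(amount) FROM fact_sales_flat GROUP BY region"
    ).fetchall()
    return dict(rows)

before_match = totals_star() == totals_flat()

# Type 1 change: overwrite the attribute in place, keeping no history.
cur.execute("UPDATE dim_customer SET region = 'AMER' WHERE customer_id = 1")

star_after = totals_star()  # sees the new region via the join
flat_after = totals_flat()  # still carries the region copied at load time
```

After the one-row update, the star-schema query reports the new region for all of customer 1's sales, while the flat table keeps the old value until every affected fact row is rewritten. On a large, append-oriented store that rewrite is the expensive part.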
PS: We're using 'denormalize' a bit loosely here, since we are comparing to a star schema and not a 3NF schema. The proper term, arguably, is to `degenerate the dimension`. https://en.wikipedia.org/wiki/Degenerate_dimension

Grant Overby
Software Engineer
Cisco.com <http://www.cisco.com/>
grove...@cisco.com
Mobile: 865 724 4910

From: Ashok Kumar <ashok34...@yahoo.com>
Reply-To: user@hive.apache.org, Ashok Kumar <ashok34...@yahoo.com>
Date: Friday, December 18, 2015 at 4:22 PM
To: user@hive.apache.org
Subject: Re: The advantages of Hive/Hadoop comnpared to Data Warehouse

Thank you sir.

Can you please describe in a bit more detail your vision of "A fully denormalized columnar store"? Do you mean getting rid of the star schema altogether in Hive and replacing it with ORC tables?

Regards

On Friday, 18 December 2015, 21:13, Grant Overby (groverby) <grove...@cisco.com> wrote:

You forgot horizontal scaling.
A fully denormalized columnar store in Hive will outperform a star schema in Oracle in every way imaginable at scale; however, if your data isn't big enough, this is a moot point. If your data fits in a traditional BI warehouse, and especially if it does so without people constantly complaining about performance or retention issues, your data may simply not be large enough to warrant a Hadoop stack.

How many rows are in your largest fact table?

From: Ashok Kumar <ashok34...@yahoo.com>
Reply-To: User <user@hive.apache.org>, Ashok Kumar <ashok34...@yahoo.com>
Date: Friday, December 18, 2015 at 4:01 PM
To: User <user@hive.apache.org>
Subject: The advantages of Hive/Hadoop comnpared to Data Warehouse

Gurus,

Some analysts keep asking me about the advantages of having Hive tables when the star schema in a Data Warehouse (DW) does the same. For example, if you have fact and dimension tables in the DW and just import them into Hive via, say, Sqoop, what are we going to gain?

I keep telling them about storage economy and cheap disks, that de-normalisation can be taken further, etc. However, they are not convinced :(

Any additional comments will help my case.

Thanks a lot