Adding to the valuable points raised by Grant and Jörn, one can also consider
the Total Cost of Ownership (TCO) of the application on Big Data compared to
a traditional warehouse such as Oracle or Sybase IQ, which rely on vertical
scaling in an SMP environment on expensive SAN. Although there are a lot of
other abstraction layers, storage cost is an important factor for many
companies, as they are trying to archive old data using tools like RainStor
at a cost of typically 0.4 US Dollar/GB/Month on cheap storage, compared to
traditional RDBMS data at around 5 US Dollar/GB/Month on SAN. Although this
looks like a cheaper TCO, it is far from desirable, as accessing archived
data when it is needed is problematic and requires further investment.
I don’t think there is such an issue with HDFS.
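
As a rough illustration of the storage figures above, here is a minimal
Python sketch that compares the monthly cost of the two tiers. The two rates
are the ones quoted in this message; the 10 TB data set size and the function
name are assumptions made up for the example. It covers raw storage only and
leaves out the access and retrieval costs mentioned above.

```python
# Per-GB monthly storage rates quoted in the message (US dollars).
ARCHIVE_USD_PER_GB_MONTH = 0.4   # archived data on cheap storage
SAN_USD_PER_GB_MONTH = 5.0       # traditional RDBMS data on SAN


def monthly_storage_cost(size_gb: float, usd_per_gb_month: float) -> float:
    """Return the monthly storage cost in US dollars for size_gb of data."""
    return size_gb * usd_per_gb_month


# Hypothetical data set size, chosen only for illustration: 10 TB in GB.
size_gb = 10 * 1024

archive_cost = monthly_storage_cost(size_gb, ARCHIVE_USD_PER_GB_MONTH)
san_cost = monthly_storage_cost(size_gb, SAN_USD_PER_GB_MONTH)

print(f"Archive tier: {archive_cost:,.0f} USD/month")
print(f"SAN tier:     {san_cost:,.0f} USD/month")
print(f"SAN is {san_cost / archive_cost:.1f}x the archive cost")
```

Whatever size is plugged in, the ratio between the two tiers stays the same,
since both costs scale linearly with the amount of data.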

 

HTH,

 

Mich Talebzadeh

 

Sybase ASE 15 Gold Medal Award 2008

A Winning Strategy: Running the most Critical Financial Data on ASE 15

http://login.sybase.com/files/Product_Overviews/ASE-Winning-Strategy-091908.pdf

Author of the books "A Practitioner’s Guide to Upgrading to Sybase ASE 15",
ISBN 978-0-9563693-0-7. 

co-author "Sybase Transact SQL Guidelines Best Practices", ISBN
978-0-9759693-0-4

Publications due shortly:

Complex Event Processing in Heterogeneous Environments, ISBN:
978-0-9563693-3-8

Oracle and Sybase, Concepts and Contrasts, ISBN: 978-0-9563693-1-4, volume
one out shortly

 

http://talebzadehmich.wordpress.com/

 


 

From: Grant Overby (groverby) [mailto:grove...@cisco.com] 
Sent: 18 December 2015 21:51
To: user@hive.apache.org; Ashok Kumar <ashok34...@yahoo.com>
Subject: Re: The advantages of Hive/Hadoop comnpared to Data Warehouse

 

Yes, that is what I meant.

 

In practice, it is often not possible to reach a 100% perfectly denormalized
fact table (for instance, if you need type 1 dimension behavior). 
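
To make the type 1 point concrete, here is a minimal Python sketch, not taken
from any real schema: the table contents and names (customer_dim, star_facts,
flat_facts) are invented for illustration. With type 1 behavior an attribute
is overwritten in place with no history kept. In a star schema that is a
single dimension row; in a fully denormalized table every fact row carrying
the old value has to be rewritten.

```python
# Star schema: fact rows reference the dimension by key.
customer_dim = {1: {"city": "London"}, 2: {"city": "Leeds"}}
star_facts = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": 1, "amount": 25.0},
    {"customer_id": 2, "amount": 7.5},
]

# Denormalized: the dimension attribute is copied onto every fact row.
flat_facts = [
    {
        "customer_id": f["customer_id"],
        "city": customer_dim[f["customer_id"]]["city"],
        "amount": f["amount"],
    }
    for f in star_facts
]

# Type 1 change: customer 1 moves city and the old value is overwritten.
new_city = "Bristol"

# Star schema: exactly one dimension row changes.
customer_dim[1]["city"] = new_city

# Denormalized table: every fact row for that customer must be rewritten.
rows_rewritten = 0
for row in flat_facts:
    if row["customer_id"] == 1:
        row["city"] = new_city
        rows_rewritten += 1

print(f"Dimension rows updated: 1, fact rows rewritten: {rows_rewritten}")
```

On a large fact table that rewrite is the expensive part, which is one reason
a small dimension is sometimes kept as a separate table.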

 

 

PS:

We’re using ‘denormalize’ a bit loosely here since we are comparing to a
star schema and not a 3NF schema. The proper term here, arguably, is to
`degenerate the dimension`.
https://en.wikipedia.org/wiki/Degenerate_dimension






Grant Overby
Software Engineer
 <http://www.cisco.com/> Cisco.com
 <mailto:grove...@cisco.com> grove...@cisco.com
Mobile: 865 724 4910

                
        

 


 Think before you print.


This email may contain confidential and privileged material for the sole use
of the intended recipient. Any review, use, distribution or disclosure by
others is strictly prohibited. If you are not the intended recipient (or
authorized to receive for the recipient), please contact the sender by reply
email and delete all copies of this message.

Please  <http://www.cisco.com/web/about/doing_business/legal/cri/index.html>
click here for Company Registration Information.

 

 

From: Ashok Kumar <ashok34...@yahoo.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>, Ashok Kumar
<ashok34...@yahoo.com>
Date: Friday, December 18, 2015 at 4:22 PM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Re: The advantages of Hive/Hadoop comnpared to Data Warehouse

 

Thank you sir.

 

Can you please describe in a bit more detail your vision of "A fully
denormalized columnar store"? Are you referring to getting rid of the star
schema altogether in Hive and replacing it with ORC tables?

 

Regards

 

On Friday, 18 December 2015, 21:13, Grant Overby (groverby)
<grove...@cisco.com> wrote:

 

You forgot horizontal scaling.

 

A fully denormalized columnar store in Hive will outperform a star schema
in Oracle in every way imaginable at scale; however, if your data isn’t big
enough then this is a moot point.

 

If your data fits in a traditional BI warehouse, and especially if it does
so without people constantly complaining about performance or retention
issues, your data may simply not be large enough to warrant a Hadoop stack.

 

How many rows are in your largest fact table?




 

 

From: Ashok Kumar <ashok34...@yahoo.com>
Reply-To: User <user@hive.apache.org>, Ashok Kumar <ashok34...@yahoo.com>
Date: Friday, December 18, 2015 at 4:01 PM
To: User <user@hive.apache.org>
Subject: The advantages of Hive/Hadoop comnpared to Data Warehouse

 

Gurus,

 

Some analysts keep asking me about the advantages of having Hive tables when
the star schema in a Data Warehouse (DW) does the same.

 

For example, if you have fact and dimension tables in a DW and just import
them into Hive via, say, Sqoop, what are we going to gain?

 

I keep telling them about storage economy and cheap disks, that
de-normalisation can be taken further, etc. However, they are not convinced :(

 

Any additional comments will help my case.

 

Thanks a lot

 
